Skip to content

Flow analysis preliminaries #473

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

leafpetersen
Copy link
Member

This is not for submittal, just using this as a place for discussion of the setup for flow analysis, and the policy choices.

is statically known to flow analysis about the state of the program at a given
point in the source code.

- `reachable` is a stack of boolean values modeling the reachability of the
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what's going on here? This is my attempt to incorporate Paul's extended reachability into the basic framework. I think this is roughly what the extended reachability was doing, except that instead of keeping the stack of previous reachability values on the call stack, I'm keeping it explicitly in the analysis. An implementation could do either. So far, I've found the book-keeping a little cleaner here. Note that I'm possibly applying this in more places than Paul had originally intended? For example, I think the following is a motivating example:

Object x;
if (false && x is int) {
  x.isEven
}

I'm not sure whether Paul intended this to promote in the body or not, but it seemed reasonable, and I think if I have this set up right, then we will get that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the way you've phrased this paragraph, and I'm inclined to try to adjust the implementation to keep the stack explicitly in the FlowModel just for clarity. I'll add a test case for this example and see if it works :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - glad that phrasing this as a stack agrees with your intuition. My text was not completely accurate though - the elements of the stack really describe the reachability of a single link in the chain of control flow splits. I've updated the text to be more precise, and hopefully still conveying the intuition.

FlowModel(r2, VI2))` and `pop(r1) = pop(r2) = r0` for some `r0` to be `M3` where:
- if `top(r1)` is true and `top(r2)` is false, then `M3` is `FlowModel(r0, VI1)`.
- if `r1` is false and `r2` is true, then `M3` is `FlowModel(pop(r2), VI2)`.
- otherwise `M3 = FlowModel(r3, VI3)` where:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% clear whether I'm doing the right thing here. It's clear what to do if one of the branches is provisionally reachable and the other is not, and it's clear what to do if both are provisionally reachable. But what about if both are not provisionally reachable? My guess here is that the right thing to do is to treat them both as provisionally reachable - that is, keep both of their states. But I'm not 100% sure about that.

  foo(Object x, bool b) {
    throw "Unimplemented";
    if (b) {
      if (x is! int) return;
      throw "Unimplemented";
   } else {
     if (x is! int return;
     throw "Unimplemented";
  }
  x.isEven; // should work?
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me. It's a weird corner and I doubt it will come up very frequently, but what you're proposing seems like it will be fairly well behaved.


Promotion policy is defined by the following operations on flow models.

Policy:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to capture the policy options for assignment promotion that were proposed here here. What other options should I include?

type `T` given variable model `VM` if
- `VM = VariableModel(declared, promoted, sites, assigned, unassigned, captured)`
- and `captured` is false
- and promoted contains `T`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allow demotion to anything that we had previously promoted to? Probably also need to include the declared type here.

@leafpetersen
Copy link
Member Author

cc @lrhn @johnniwinther @scheglov @eernstg @munificent @stereotype441

I'm starting to work on pulling together Paul's proposal and the various discussions that have been going on into a spec for the flow analysis based type promotion etc. This is some initial machinery and policy discussion that I thought would be useful to get some early feedback on.

Copy link
Member

@stereotype441 stereotype441 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start! Thanks Leaf!

is statically known to flow analysis about the state of the program at a given
point in the source code.

- `reachable` is a stack of boolean values modeling the reachability of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the way you've phrased this paragraph, and I'm inclined to try to adjust the implementation to keep the stack explicitly in the FlowModel just for clarity. I'll add a test case for this example and see if it works :)

- `drop(M)`, where `M = FlowModel(r, VM)` is a flow model which models program
nodes after a control flow split which are only reachable by one path through
the split, and is defined as `FlowModel(r1, VM)` where `r0 = pop(r)` and `r1 =
push(pop(r0), top(r0) && top(r1))`. Equivalently, `drop(M)` may be thought of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean r1 = push(pop(r0), top(r0) && top(r)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

push(pop(r0), top(r0) && top(r1))`. Equivalently, `drop(M)` may be thought of
as `join(M, M)`.

- `join(M1, M2)`, where `M1` and `M2` are flow models, represents the union of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider defining join before drop, since drop refers to join.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

We define `join(M1, M2)` where `M1 = FlowModel(r1, VI1)` and `M2 =
FlowModel(r2, VI2))` and `pop(r1) = pop(r2) = r0` for some `r0` to be `M3` where:
- if `top(r1)` is true and `top(r2)` is false, then `M3` is `FlowModel(r0, VI1)`.
- if `r1` is false and `r2` is true, then `M3` is `FlowModel(pop(r2), VI2)`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you mean "if top(r1) is false and top(r2) is true"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

FlowModel(r2, VI2))` and `pop(r1) = pop(r2) = r0` for some `r0` to be `M3` where:
- if `top(r1)` is true and `top(r2)` is false, then `M3` is `FlowModel(r0, VI1)`.
- if `r1` is false and `r2` is true, then `M3` is `FlowModel(pop(r2), VI2)`.
- otherwise `M3 = FlowModel(r3, VI3)` where:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me. It's a weird corner and I doubt it will come up very frequently, but what you're proposing seems like it will be fairly well behaved.

probably managable. I think doing downwards inference using the current
type, and then promoting the variable afterwards is fine for all reasonable
cases.
- The interaction between assignment based demotion and downwards inference is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a typo here? The previous bullet says the interaction between assignment based promotion and downwards inference is "probably manageable", and now we're saying that the same interaction is "a bit trickier".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the first bullet says "promotion" the second says "demotion". Added bolding to emphasize the difference between the two bullets.


- if a variable is tests before it is initialized, we must choose whether to
honor the type test or the assignment. Above I've chosen to prefer type
test based promotion. Examples:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

- `notNull(N) = join(true(N), false(N))`.
- `after(N) = notNull(N)`.

If `N` is an expression, and the above rules specify the value to be assigned to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be and the following rules, not and the above rules

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

assigned to `true(N)` and `false(N)`, but do not specify values for `null(N)`,
`notNull(N)`, or `after(N)`, then they are by default assigned as follows:
- `null(N) = exit(after(N))`.
- `notNull(N) = join(true(N), false(N))`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has to be notNull(N) = join(split(true(N)), split(false(N)))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I think maybe I need a couple of additional combinators.

- Let `before(S1) = split(true(E))`.
- Let `before(S2) = split(false(E))`.
- Let `after(N) = join(after(S1), after(S2))`.
- Let `true(N) = join(true(S1), true(S2))`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since statements don't have values, we shouldn't need true(N), false(N), null(N), or notNull(N). Just after(N) is sufficient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Member Author

@leafpetersen leafpetersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments! I've incorporated feedback so far into the main document. Sounds like we're roughly on the same page. I'll leave this open for a bit longer in case anyone else has feedback.

- `drop(M)`, where `M = FlowModel(r, VM)` is a flow model which models program
nodes after a control flow split which are only reachable by one path through
the split, and is defined as `FlowModel(r1, VM)` where `r0 = pop(r)` and `r1 =
push(pop(r0), top(r0) && top(r1))`. Equivalently, `drop(M)` may be thought of
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

push(pop(r0), top(r0) && top(r1))`. Equivalently, `drop(M)` may be thought of
as `join(M, M)`.

- `join(M1, M2)`, where `M1` and `M2` are flow models, represents the union of
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

We define `join(M1, M2)` where `M1 = FlowModel(r1, VI1)` and `M2 =
FlowModel(r2, VI2))` and `pop(r1) = pop(r2) = r0` for some `r0` to be `M3` where:
- if `top(r1)` is true and `top(r2)` is false, then `M3` is `FlowModel(r0, VI1)`.
- if `r1` is false and `r2` is true, then `M3` is `FlowModel(pop(r2), VI2)`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

- if `top(r1)` is true and `top(r2)` is false, then `M3` is `FlowModel(r0, VI1)`.
- if `r1` is false and `r2` is true, then `M3` is `FlowModel(pop(r2), VI2)`.
- otherwise `M3 = FlowModel(r3, VI3)` where:
- `r3` is `push(pop(r0), top(r0) && top(r1) && top(r2))`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added definition of terms. The problem with using a::b is that there's not really a concise notation for top and pop.

probably managable. I think doing downwards inference using the current
type, and then promoting the variable afterwards is fine for all reasonable
cases.
- The interaction between assignment based demotion and downwards inference is
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the first bullet says "promotion" the second says "demotion". Added bolding to emphasize the difference between the two bullets.


Policy:
- We say that at type `T` is a type of interest for a variable `x` in a list
of test sites `sites` if `sites` contains `x is T` or `x as T`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should handle this. I think your latter definition means that the following would be valid:

Object x;
if (x is int?) {
  x = 3;  // promote to int, since int? is of interest
  x.isEven;
  x = null; // demote to int?, since int? is of interest
}

Note sure whether that's a good thing or a bad thing? I'll think about it.

- otherwise if `x` is demotable via assignment of `E` given `VM`
- `VM = VariableModel(declared, demoted, sites, true, false, captured)`.
- where `demoted` is the suffix of `promoted` starting with the first
occurrence of `T`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the demotion criteria to demote to the first supertype of the assigned type, which I think addresses your first example. Still thinking about your second.

assigned to `true(N)` and `false(N)`, but do not specify values for `null(N)`,
`notNull(N)`, or `after(N)`, then they are by default assigned as follows:
- `null(N) = exit(after(N))`.
- `notNull(N) = join(true(N), false(N))`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I think maybe I need a couple of additional combinators.

- `notNull(N) = join(true(N), false(N))`.
- `after(N) = notNull(N)`.

If `N` is an expression, and the above rules specify the value to be assigned to
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

- Let `before(S1) = split(true(E))`.
- Let `before(S2) = split(false(E))`.
- Let `after(N) = join(after(S1), after(S2))`.
- Let `true(N) = join(true(S1), true(S2))`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

dart-bot pushed a commit to dart-lang/sdk that referenced this pull request Aug 7, 2019
This makes the terminology consistent with
dart-lang/language#473.

No change other than renaming the class, sorting the file, and
reformatting some comments.

Change-Id: I81ab759decd0495f18d2afbe9a4c9d734f393b04
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/112037
Reviewed-by: Konstantin Shcheglov <[email protected]>
dart-bot pushed a commit to dart-lang/sdk that referenced this pull request Aug 7, 2019
Previously we only stored promoted variables.  Storing all variables
paves the way for expanding the `promoted` map into a map from
variable to VariableModel, which will be necessary to bring the
implementation in line with
dart-lang/language#473.

Change-Id: Ief7190b565da03e3c8dc22f8401f96f3e8a1d356
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/112040
Reviewed-by: Johnni Winther <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
that a variable may not be both definitely assigned and definitely
unassigned).

- `writeCaptured` is a boolean value indicating whether a closure might exist at
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to implement the logic for writeCaptured in the prototype, and I wanted to let you know about two corner cases I've run into. First, what if a write happens in one closure and a read happens in another:

f(Object x) {
  void Function() h;
  var g = () {
    if (x is int) { // (1)
      h();
      x.isEven; // (2)
    }
  };
  h = () { x = "hello"; }; // (3)
  g();
}

Even though the if-test at (1) appears lexically before the closure that captures a write to x is created (at (3)), since (1) is itself inside a closure, there's no guarantee that it will execute prior to the write being captured. So promotion is unsafe in this case, and (2) should be a compile-time error. I'm handling this situation by saying that prior to analysis, we gather a list of all variables that are written to in all closures. Then, when entering any closure, analysis sets writeCaptured to true for any variables in that list.

Second, what if a write capturing closure happens in a loop, after a promotion attempt?

f(Object x) {
  void Function()? g;
  while (true) {
    if (x is int) { // (1)
      if (g != null) g();
      x.isEven; // (2)
    }
    g = () { x = 'hello'; };
  }
}

In this case the promotion at (1) is unsafe, because the second time through the loop, there is a write captured. So there should be a compile-time error at (2). I'm handling this situation by saying that when entering a loop, we set writeCaptured to true for any variables that are captured ahywhere within the loop.

stereotype441 added a commit to stereotype441/sdk that referenced this pull request Oct 5, 2019
Previously we only stored promoted variables.  Storing all variables
paves the way for expanding the `promoted` map into a map from
variable to variable state, which will be necessary to bring the
implementation in line with
dart-lang/language#473.
dart-bot pushed a commit to dart-lang/sdk that referenced this pull request Oct 9, 2019
Rather than suppress all promotions for variables written to in
closures, we track whether a closure might exist that captures a write
to a variable, and only suppress promotions once such a closure
exists.  This is consistent with what is proposed in
dart-lang/language#473.

Contains a repro of #38791.

Change-Id: I4683581908703b510a3231b8fa8ed697121b09da
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/120680
Reviewed-by: Konstantin Shcheglov <[email protected]>
@leafpetersen leafpetersen added the nnbd NNBD related issues label Feb 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes nnbd NNBD related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants