FR: support for "globs" which match any number of AST nodes inside of a sequence. For example, to normalize set literals, I should be able to write:
refex --mode=py.expr 'set([$x...])' --sub='{$x...}' -i file.py
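To make the intended effect concrete, here is a minimal sketch of the same rewrite done by hand with the standard library. It is not how refex works and says nothing about its internals. The names `SetLiteralNormalizer` and `normalize_sets` are made up for illustration, and `ast.unparse` needs Python 3.9+. The empty case is skipped on purpose, since `{}` is a dict, not a set.

```python
import ast


class SetLiteralNormalizer(ast.NodeTransformer):
    """Rewrite set([a, b, ...]) into {a, b, ...} (hypothetical helper)."""

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        is_set_of_list = (
            isinstance(node.func, ast.Name)
            and node.func.id == "set"
            and len(node.args) == 1
            and not node.keywords
            and isinstance(node.args[0], ast.List)
        )
        # An empty list must be left alone: "{}" would be an empty dict.
        if is_set_of_list and node.args[0].elts:
            return ast.copy_location(ast.Set(elts=node.args[0].elts), node)
        return node


def normalize_sets(source):
    """Return source with set([..]) calls turned into set literals."""
    tree = SetLiteralNormalizer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(normalize_sets("s = set([1, 2, x])"))  # s = {1, 2, x}
print(normalize_sets("e = set([])"))         # e = set([])
```

Unlike the refex one-liner, this round-trips through `ast.unparse` and so loses comments and formatting, which is exactly what a glob-aware textual substitution would avoid.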
A rough TODO list:
- glob expressions inside of container literals: [$x...]
- glob statements inside of suites: $x...
- glob chaining comparisons (may defer to later issue, not important): a < $x... < z
- glob function arguments
  - decide on how to deal with keyword arguments: should f($x...) include kwargs?
  - glob function definitions
- globbed "fragments", especially:
  - with statements (globbed withitem)
  - dicts (globbed k:v pairs)
  - kwargs in function calls and definitions, irrespective of the default handling of f($x...)
- glob expression substitution
- glob sub-expression substitution: e.g. replace [$x...] with [$($x + 1)...] to go from [1, 2] to [1 + 1, 2 + 1]
- named-sub targeting "empty glob"
- safety:
  - auto-comma add/remove as appropriate (... if necessary?)
  - forbid globbed statement insertion into expr glob, and vice versa.
Glob chaining comparisons
This requires explicit handling, because a naive implementation would probably be badly broken.
(In particular, if you naively glob any sequence in the AST, you'd glob only the list of operands being compared (comparators in Python's ast.Compare), and not the list of comparison operators (ops). The pattern a < $x... < z would still contain exactly two operators, so although you could write it, it would match only the same expressions as a < $y < z.)
At the least it should be explicitly rejected. At the most, it seems ok to allow it, but require that the comparison operator is the same across the whole thing (so e.g. a < $x... < z matches a < b < c < z, but not a < b <= c < z; a < $x... <= z would be an invalid pattern). Anything more sophisticated would still require a custom matcher.
Note: this is (one of?) the only expression globs that isn't comma-delimited, so it is very different from most contexts.
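The parallel lists behind this problem are easy to see with the standard `ast` module. The sketch below is illustration only and is not refex code. `compare_parts` and `uniform_operator` are hypothetical names, and `ast.unparse` needs Python 3.9+. The second helper shows the "same operator across the whole chain" restriction suggested above.

```python
import ast


def compare_parts(source):
    """Return (operand sources, operator class names) for one comparison."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Compare):
        raise ValueError("not a comparison: %r" % (source,))
    # ast.Compare stores the first operand separately from the rest.
    operands = [node.left] + node.comparators
    return ([ast.unparse(o) for o in operands],
            [type(op).__name__ for op in node.ops])


def uniform_operator(source):
    """Return the one operator used throughout the chain, or None if mixed."""
    _, ops = compare_parts(source)
    return ops[0] if len(set(ops)) == 1 else None


print(compare_parts("a < b < c < z"))
# (['a', 'b', 'c', 'z'], ['Lt', 'Lt', 'Lt'])
print(uniform_operator("a < b < c < z"))   # Lt
print(uniform_operator("a < b <= c < z"))  # None
```

A glob over `comparators` alone cannot change the length of `ops`, which is why the two lists would have to be globbed in lockstep.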
Glob function arguments
Something needs to be done here -- as above, due to the AST, the naive implementation would be wrong, and give totally different behavior for function calls and function definitions due to the differing ASTs. Either both should catch all parameters of all kinds, or neither should. It seems desirable to have $x... by default catch, like, everything between the parens, as that makes it easier to do things, but doesn't remove any power.
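For reference, this is how differently the standard `ast` module lays out calls and definitions. The sketch only inspects the stdlib AST and does not describe refex's matcher. `everything_between_parens` is a hypothetical helper showing one way to recover "everything between the parens" of a call in source order, and it relies on `keyword` nodes carrying positions, which is the case from Python 3.9.

```python
import ast

# A call splits its arguments into two lists: args (including *b)
# and keywords (including **kw, stored as a keyword whose arg is None).
call = ast.parse("f(a, k=1, *b, **kw)", mode="eval").body
call_fields = {"args": len(call.args), "keywords": len(call.keywords)}

# A definition uses a single ast.arguments node with several fields.
func = ast.parse("def f(a, b=2, *c, d, e=5, **kw): pass").body[0]
params = func.args
def_fields = {
    "args": [p.arg for p in params.args],
    "vararg": params.vararg.arg,
    "kwonlyargs": [p.arg for p in params.kwonlyargs],
    "kwarg": params.kwarg.arg,
    "defaults": len(params.defaults),        # defaults for the trailing args
    "kw_defaults": len(params.kw_defaults),  # one slot per keyword-only arg
}


def everything_between_parens(call_node):
    """All positional and keyword arguments of a call, in source order."""
    items = list(call_node.args) + list(call_node.keywords)
    return sorted(items, key=lambda n: (n.lineno, n.col_offset))


print(call_fields)
print(def_fields)
print([type(n).__name__ for n in everything_between_parens(call)])
```

Note that `k=1` precedes `*b` in the source but the two land in different lists, so a glob over either list alone would not correspond to a contiguous span of text.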
Globbed fragments
Aside from expressions and statements, many things one would reasonably want to glob are fragments that are only valid in exactly the same sort of context: for example, k:v pairs in dicts, and keyword arguments / parameter definitions in function calls / definitions.
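A quick look at the standard `ast` module shows why these fragments are awkward. This is a sketch for illustration, with `dict_fragments` being a hypothetical helper and `ast.unparse` requiring Python 3.9+. A dict's k:v pairs are not nodes at all, just two parallel lists, and a `withitem` is a node but carries no position attributes of its own.

```python
import ast


def dict_fragments(source):
    """Return the (key, value) source pairs of a dict literal.

    A **unpacking entry has no key, so its key is reported as None.
    """
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Dict):
        raise ValueError("not a dict literal: %r" % (source,))
    return [
        (None if k is None else ast.unparse(k), ast.unparse(v))
        for k, v in zip(node.keys, node.values)
    ]


print(dict_fragments("{'a': 1, **rest, 'b': 2}"))
# [("'a'", '1'), (None, 'rest'), ("'b'", '2')]

# withitem is a real node, but it has no lineno/col_offset attributes.
with_stmt = ast.parse("with open(p) as f, lock: pass").body[0]
print([type(i).__name__ for i in with_stmt.items])  # ['withitem', 'withitem']
print(with_stmt.items[0]._attributes)               # ()
```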
Glob sub-expression substitution
Scheme/Rust macro-by-example is the way to go here :) Every layer of $(something)... "unwraps" one layer of globbing, so that inside, a reference to $x refers to individually matched things rather than the whole glob.
Some care will need to be taken, e.g. around comments between members, if we want $x... and $($x)... to be identical in all cases.
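The unwrapping rule can be sketched on plain strings. This is a toy model under stated assumptions, not a proposed implementation: `expand_repetition` is a hypothetical name, the template is the text inside one `$( ... )...` layer, each binding is the list of source snippets matched by that glob, and the separator is assumed to be a comma.

```python
import re

_VARIABLE = re.compile(r"\$(\w+)")


def expand_repetition(template, bindings, separator=", "):
    """Expand one $(template)... layer.

    Inside the layer, $name refers to a single matched item, so the
    template is instantiated once per item and the results are joined.
    """
    names = _VARIABLE.findall(template)
    if not names:
        raise ValueError("template uses no glob variable")
    lengths = {len(bindings[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("globs used together must have the same length")
    (count,) = lengths
    pieces = []
    for i in range(count):
        # Substitute the i-th match of every variable in the template.
        pieces.append(
            _VARIABLE.sub(lambda m: bindings[m.group(1)][i], template))
    return separator.join(pieces)


# [$x...] matched against [1, 2] binds x to ["1", "2"].
print("[" + expand_repetition("$x + 1", {"x": ["1", "2"]}) + "]")
# [1 + 1, 2 + 1]
```

Joining with a fixed separator is the simplification that drops anything (such as comments) sitting between the original members, which is the case where $x... and $($x)... could diverge.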
Named-sub targeting "empty glob"
For a nonempty glob match (with the exception of some things like dict literals), the lexical span is just everything between and including the first and last match of the glob. This doesn't work for empty matches. In fact, nothing does.
asttokens doesn't have any way to find out what the token range is for the empty list of elts inside of []. Probably it needs to have a function which obtains the token range for a field, rather than just a node, and then this gets implemented one at a time for container literals, function calls, etc.
Will file bug report upstream and volunteer to put some minimal work into it (at least elts for container literals, since that's easy enough.)
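To illustrate what a per-field range would have to return, here is a rough sketch using only the standard `ast` module. It is not asttokens API, and `list_elts_span` is a hypothetical name. It assumes a single-line, ASCII-only list literal on Python 3.8+, so that `col_offset` and `end_col_offset` can be used directly as string indices.

```python
import ast


def list_elts_span(node):
    """Return (start, end) columns covering the elts of a list literal.

    Assumes the literal sits on one line of ASCII source.
    """
    if not isinstance(node, ast.List):
        raise TypeError("only list literals are handled in this sketch")
    if node.elts:
        # Nonempty: from the first element to the end of the last one.
        return node.elts[0].col_offset, node.elts[-1].end_col_offset
    # Empty: no child node to ask, so fall back to the gap between the
    # brackets, derived from the parent node's own position.
    return node.col_offset + 1, node.end_col_offset - 1


for source in ("x = [1, 2]", "x = []"):
    literal = ast.parse(source).body[0].value
    start, end = list_elts_span(literal)
    print(repr(source[start:end]), (start, end))
# '1, 2' (5, 9)
# '' (5, 5)
```

The empty case is the one that needs knowledge of the container's own syntax, which is why it would have to be implemented field by field for container literals, function calls, and so on.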