This is very ugly: Bison provides a yynerrs variable, which is
usually not actually used, but also not annotated with
YY_MAYBE_UNUSED. Suppress this warning by adding a (void)yynerrs
into the top-level reduction action. The alternative would be to
disable the warning for these generated files.
RFC: https://wiki.php.net/rfc/dnf_types
This allows to combine union and intersection types together in the following form (A&B)|(X&Y)|T but not of the form (X|A)&(Y|B) or (X|A)&(Y|B)|T.
* Improve union type parsing
Co-authored-by: Sara Golemon <pollita@php.net>
The output of token_get_all is unaffected, so projects such as the userland
nikic/php-parser are unaffected.
Extensions such as nikic/php-ast which expose the internal php ast would see
literals be flattened, though.
This makes php significantly faster at parsing code such as
`$x = eval('return ' . var_export(str_repeat("\0", 100), true) . ';');`
and avoids the stack overflow from recursing 100000 times in
zend_eval_const_expr to process `'' . "\0" . '' . "\0" . ...`
Closes GH-7946
* Don't create binary op if unnecessary
* Update Zend/zend_ast.c
Co-authored-by: Nikita Popov <nikita.ppv@googlemail.com>
Add support for readonly properties, for which only a single
initializing assignment from the declaring scope is allowed.
RFC: https://wiki.php.net/rfc/readonly_properties_v2
Closes GH-7089.
Support acquiring a Closure to a callable using the syntax
func(...), $obj->method(...), etc. This is essentially a
shortcut for Closure::fromCallable().
RFC: https://wiki.php.net/rfc/first_class_callable_syntax
Closes GH-7019.
Co-Authored-By: Nikita Popov <nikita.ppv@gmail.com>
Because php supports doc comments on class constants, I believe it would also
make sense to support them on enum cases.
I don't have strong opinions about whether attributes should be moved to be the
last element or whether the doc comment should go after the attribute,
but the ast will likely change again before php 8.1 is stable.
So far, all attributes are the last ast child node.
I didn't notice that doc comments weren't implemented due to
https://github.com/php/php-src/pull/6489 being a large change.
https://wiki.php.net/rfc/enumerations
did not mention whether or not doc comments were meant to be supported
This PR corrects misspellings identified by the check-spelling action.
The misspellings have been reported at jsoref@b6ba3e2#commitcomment-48946465
The action reports that the changes in this PR would make it happy: jsoref@602417c
Closes GH-6822.
From an engine perspective, named parameters mainly add three
concepts:
* The SEND_* opcodes now accept a CONST op2, which is the
argument name. For now, it is looked up by linear scan and
runtime cached.
* This may leave UNDEF arguments on the stack. To avoid having
to deal with them in other places, a CHECK_UNDEF_ARGS opcode
is used to either replace them with defaults, or error.
* For variadic functions, EX(extra_named_params) are collected
and need to be freed based on ZEND_CALL_HAS_EXTRA_NAMED_PARAMS.
RFC: https://wiki.php.net/rfc/named_params
Closes GH-5357.
RFC: https://wiki.php.net/rfc/trailing_comma_in_closure_use_list
Discussion: https://externals.io/message/110715
The release manager has agreed to allow merging of RFCs that have near-unanimous
votes. If an RFC ends up not achieving the required 2/3 majority at the time the
announced voting period closes, this implementation commit will be reverted
in time for the feature freeze.
Closes GH-5793
Currently, unexpected tokens in the parser are shown as the text
found, plus the internal token name, including the notorious
"unexpected '::' (T_PAAMAYIM_NEKUDOTAYIM)".
This commit replaces that with a more user-friendly format, with
two main types of token:
* Tokens which always represent the same text are shown like
'unexpected token "::"' and 'expected "::"'
* Tokens which have variable text are given a user-friendly
name, and show like 'unexpected identifier "foo"', and
'expected identifer'.
A few tokens have special cases:
* unexpected token """ -> unexpected double-quote mark
* unexpected quoted string "'foo'" -> unexpected single-quoted
string "foo"
* unexpected quoted string ""foo"" -> unexpected double-quoted
string "foo"
* unexpected illegal character "_" -> unexpected character 0xNN
(where _ is almost certainly a control character, and NN is the
hexadecimal value of the byte)
The \ token has a special case in the implementation just to stop
bison making a mess of escaping it and it coming out as \\
As part of my work on typed class constants, I wanted to make a
separate pull request for unrelated changes.
These include:
* Moving from ast->child[0]->attr to ast->attr
* Making zend_ast_export_ex() export class constants' visibility
* Extracting an additional zend_compile_class_const_group() function
Closes GH-5812.
One of the weirdest pieces of PHP code I've ever seen. In terms
of tokens, this gets internally translated to
use x as y; echo as my_echo;
On master it crashes because this "echo" does not have attached
identifier metadata. Make sure it is added and then reject the
use of "<?=" as an identifier inside zend_lex_tstring.
Fixes oss-fuzz #23547.
This is a bit tricky: In this cases we have "namespace as", which
means that we will only recognize "namespace" as an identifier when
the lookahead token is already at the "as". This means that
zend_lex_tstring picks up the wrong identifier.
We solve this by actually assigning the identifier as the semantic
value on the parser stack -- as in almost all cases we will not
actually need the identifier, this is just an (offset, size)
reference, not a copy of the string.
Additionally, we need to teach the lexer feedback mechanism used
by tokenizer TOKEN_PARSE mode to apply feedback to something
other than the very last token. To that purpose we pass through
the token text and check the tokens in reverse order to find the
right one.
Closes GH-5668.
RFC: https://wiki.php.net/rfc/throw_expression
This has an open issue with temporaries that are live at the time
of the throw being leaked. Landing this now for easier testing and
will revert if we cannot resolve the issue.
Closes GH-5279.
RFC: https://wiki.php.net/rfc/static_return_type
The "static" type is represented as MAY_BE_STATIC, rather than
a class type like "self" and "parent", as it has special
resolution semantics, and cannot be cached in the runtime cache.
Closes GH-5062.
Prefer '%define api.value.type' to '#define YYSTYPE', so that Bison
know the type.
Use '%code requires' to declare what is needed to define the api.value.type
(that code is output in the generated header before the generated
definition of YYSTYPE).
Prefer '%define api.prefix' inside the grammar file to '-p' outside,
as anyway the functions defined in the file actually use this prefix.
Prefer `%param` to both `%parse-param` and `%lex-param`.
Closes GH-5138
First, fix 5547d36120: the definition of
YFLAGS was not passed into the Makefile: AC_SUBST does not suffice, we
need PHP_SUBST_OLD. While at it, allow to pass variable and value at
the same time.
Then pass -Wall to bison, rather than only -Wempty-rules.
Use %precedence where associativity is useless.
Remove useless %precedence.
GH-5138
The annotation %empty is properly enforced: warnings when it's
missing, and errors when it's inappropriate. Support for %empty was
introduced in Bison 3.0.
Pass -Wempty-rule to Bison.
Closes GH-5134
The YYERROR_VERBOSE macro will no longer be supported in Bison 3.6.
It was superseded by the "%error-verbose" directive in Bison 1.875
(2003-01-01). Bison 2.6 (2012-07-19) clearly announced that support
for YYERROR_VERBOSE would be removed. Note that since Bison 3.0
(2013-07-25), "%error-verbose" is deprecated in favor of "%define
parse.error verbose".
Closes GH-5125.
Throw a compile error for "static" references instead, where it
isn't already the case.
Also extract the code that does that -- we have quite a few places
where we get a const class ref and require it to be default.
According to RFC: https://wiki.php.net/rfc/union_types_v2
The type representation now makes use of both the pointer payload
and the type mask at the same time. Additionall, zend_type_list is
introduced as a new kind of pointer payload, which is used to store
multiple class types. Each of the class types is a tagged pointer,
which may be either a class name or class entry. The latter is only
used for typed properties, while arguments/returns will instead use
cache slots. A type list can contain a mix of both names and CEs at
the same time, as not all classes may be resolvable.
One thing this is missing is support for union types in arginfo
and stubs, which I want to handle separately.
I've also dropped the special object code from the JIT implementation
for now -- I plan to add this back in a different form at a later time.
For now I did not want to include non-trivial JIT changes together
with large functional changes.
Another possible piece of follow-up work is to implement "iterable"
as an internal alias for "array|Traversable". I believe this will
eliminate quite a few special-cases that had to be implemented.
Closes GH-4838.