The $Id$ keywords were used in Subversion where they can be substituted
with filename, last revision number change, last changed date, and last
user who changed it.
In Git this functionality is different and can be done with Git attribute
ident. These need to be defined manually for each file in the
.gitattributes file and are afterwards replaced with 40-character
hexadecimal blob object name which is based only on the particular file
contents.
This patch simplifies handling of $Id$ keywords by removing them since
they are not used anymore.
RFC: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
* The ending label no longer has to be followed by a semicolon or
newline. Any non-label character is fine.
* The ending label may be indented. The indentation will be stripped
from all lines in the heredoc/nowdoc string.
Lexing of heredoc strings performs a scan-ahead to determine the
indentation of the ending label, so that the correct amount of
indentation can be removed when calculting the semantic values for
use by the parser. This makes the implementation quite a bit more
complicated than we would like :/
Introduce an on_event_context passed to the on_event hook. Use this
context to pass along the token array. Previously this was stored
in a non-tls global :/
we basically added a mechanism to store the token stream during parsing
and exposed the entire parser stack on the tokenizer extension through
an opt in flag: token_get_all($src, TOKEN_PARSE).
this change allows easy future language enhancements regarding context
aware parsing & scanning without further maintance on the tokenizer
extension while solves known inconsistencies "parseless" tokenizer
extension has when it handles `__halt_compiler()` presence.
This fixes bug #60097.
Before two global variables CG(heredoc) and CG(heredoc_len) were used to
track the current heredoc label. In order to support nested heredoc
strings the *previous* heredoc label was assigned as the token value of
T_START_HEREDOC and the language_parser.y assigned that to CG(heredoc).
This created a dependency of the lexer on the parser. Thus the
token_get_all() function, which accesses the lexer directly without
also running the parser, was not able to tokenize nested heredoc strings
(and leaked memory). Same applies for the source-code highlighting
functions.
The new approach is to maintain a heredoc_label_stack in the lexer, which
contains all active heredoc labels.
As it is no longer required, T_START_HEREDOC and T_END_HEREDOC now don't
carry a token value anymore.
In order to make the work with zend_ptr_stack in this context more
convenient I added a new function zend_ptr_stack_top(), which retrieves the
top element of the stack (similar to zend_stack_top()).
this enables ZE2 to gracefully parse scripts written in UTF-8 (with BOM),
UTF-16, UTF-32, Shift_JIS, ISO-2022-JP etc... (when configured with
'--enable-zend-multibyte' and '--enable-mbstring')