3 issues:
1) RETURN_NULL() was used via the macro NODE_GET_OBJ(), but the function
returns false on failure and cannot return null according to its
stub.
2) The struct layout of the different implementors of libxml only
guarantees overlap between the node pointer and the document
reference, so accessing the std zend_object may not work.
3) DOC_GET_OBJ() wasn't using ZSTR_VAL().
Closes GH-16307.
We also add zend_map_ptr_static, so that we do not incur the overhead of constantly recreating the internal run_time_cache pointers on each request.
This mechanism might be extended for mutable_data of internal classes too.
* Include from build dir first
This fixes out of tree builds by ensuring that configure artifacts are included
from the build dir.
Before, out of tree builds would preferably include files from the src dir, as
the include path was defined as follows (ignoring includes from ext/ and sapi/) :
-I$(top_builddir)/main
-I$(top_srcdir)
-I$(top_builddir)/TSRM
-I$(top_builddir)/Zend
-I$(top_srcdir)/main
-I$(top_srcdir)/Zend
-I$(top_srcdir)/TSRM
-I$(top_builddir)/
As a result, an out of tree build would include configure artifacts such as
`main/php_config.h` from the src dir.
After this change, the include path is defined as follows:
-I$(top_builddir)/main
-I$(top_builddir)
-I$(top_srcdir)/main
-I$(top_srcdir)
-I$(top_builddir)/TSRM
-I$(top_builddir)/Zend
-I$(top_srcdir)/Zend
-I$(top_srcdir)/TSRM
* Fix extension include path for out of tree builds
* Include config.h with the brackets form
`#include "config.h"` searches in the directory containing the including-file
before any other include path. This can include the wrong config.h when building
out of tree and a config.h exists in the source tree.
Using `#include <config.h>` uses exclusively the include path, and gives
priority to the build dir.
As XMLReader only exposes a single class, and the property handlers are
statically set, we don't need to store the pointer to the property
handler table inside the object.
This simplifies the code and reduces the memory required for XMLReader.
Fixes GHSA-3qrf-m4j2-pcrr.
To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_NOENT to substitute
entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports providing default set options.
For example, if you call xmlSubstituteEntitiesDefault(1) then the XML_NOENT
option will be added to the parsing options every time you create a parsing
context **even if the application never requested XML_NOENT**.
Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.
Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
Fix it by extending the array sizes by one character. As the input is
limited to the maximum path length, there will always be place to append
the slash. As the php_check_specific_open_basedir() simply uses the
strings to compare against each other, no new failures related to too
long paths are introduced.
We'll let the DOM and XML case handle a potentially too long path in the
library code.
This reverts commit 94ee4f9834.
The commit was a bit too late to be included in PHP 8.2 RC1. Given it's a massive ABI break, we decide to postpone the change to PHP 8.3.
The current error message is incorrect -- the problem here is not
that the property is invalid, but that these methods are unusable
prior to loading data, same as read().