TIDY_APPLY_CONFIG can early return because it's a macro, but then the
cleanup paths are not executed. Transform this to a real function and
handle the cleanups correctly at the callsites.
Closes GH-15046.
When dealing with a file, we must free the contents if the function
fails. While here, also fix the error message because previously it
sounded like the filename was too long while in fact the file itself
is too large.
Closes GH-14862.
On my system, with Tidy 5.7.45, I get the following error diff for two
tests:
002+ line 1 column 7 - Error: <asd> is not recognised!
002- line 1 column 7 - Error: <asd> is not recognized!
As we can see, the spelling of recognised is different. Use an EXPECTF
and %c to mitigate this issue.
Signed-off-by: George Peter Banyard <girgias@php.net>
Parse errors were not reported for the default config, they were only
reported when explicitly another config was loaded.
This means that users may not be aware of errors in their configuration
and therefore the behaviour of Tidy might not be what they intended.
This patch fixes that issue by using a common function. In fact, the
check for -1 might be enough for the current implementation of Tidy, but
the Tidy docs say that any value other than 0 indicates an error.
So future errors might not be caught when just using an error code of -1.
Therefore, this also changes the error code checks of == -1 to < 0 and
== 1 to > 0.
Closes GH-10636
Signed-off-by: George Peter Banyard <girgias@php.net>
We must not instantiate the object prior checking error conditions
Moreover, we need to release the HUGE amount of memory for files which are over 4GB when throwing a ValueError
Closes GH-10545
For rationale, see #6787
Extensions migrated in part 4:
* simplexml
* skeleton
* soap
* spl
* sqlite3
* sysvmsg
* sysvsem
* tidy - also removed a check for an ancient dependency version
"Uninitialized" here means that the object was created ordinarily
-- no constructor skipping involved. Most tidy methods seem to
handle this fine, but these three need to be guarded.
The documentation of `tidyNode::isHtml()` states that this method
"checks if a node is part of a HTML document". That is, of course,
nonsense, since a tidyNode is "an HTML node in an HTML file, as
detected by tidy."
What this method is actually supposed to do is to check whether a node
is an element (unless it is the root element). This has been broken by
commit d8eeb8e[1], which assumed that `enum TidyNodeType` would
represent flags of a bitmask, what it does not.
[1] <http://git.php.net/?p=php-src.git;a=commit;h=d8eeb8e28673236bca3f066ded75037a5bdf6378>
Closes GH-6290.
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines in all
*.phpt sections.
According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.
C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."
Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines in all
*.phpt sections.
According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.
C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."
Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
libtidy 5.6.0 remove the language option from the library, it is only
supported on cli. Prior to that, this option was not used in the
library. Thus, exclude the option presence from test.
Our existing test 024.phpt actually tests incorrect behavior. There is
a self-closing tag present in the input, but the expected output has
that same tag half-open (i.e. open but never closed). To support
tidy-html5, which does the right thing, that test needed to be
changed. The self-closing tag was replaced by an explicit pair of
tags, and some extra whitespace fudging was done.
One of the tests for tidy (016.phpt) is testing that we can use a
configuration file (016.tcfg) instead of a string to configure
tidy. It was observing the output of an API call, which proved too
fragile now that we support tidy-html5 as well. Instead, the test was
updated to inspect $tidy->getConfig() to ensure that the config file
was actually processed and will be respected.
One of the tidy tests expects some output that has (harmlessly)
changed in tidy-html5. The "EXPECT" block for that test was changed to
"EXPECTF" and mangled to accept both the old and new outputs.
Some of the tidy tests expect output that can change. The motivating
example is an object "id" that is some integer, but no integer in
particular. Those hard-coded values have been changed to accept any
integer so that the test suite passes when tidy-html5 is used.