I'm trying to write a wrapper around this library that can be used as a generator. Basically, I want to yield every JSON object I need from a stream.
The library does not use yield or generators, so this is what I came up with:
First, a simple listener for this example: let's collect all values from "foo" fields, assuming they are all scalar.
class FooJsonListener extends \JsonStreamingParser\Listener\IdleListener {
    // True while the most recently seen key was "foo"
    protected $isReadingFoo = false;
    protected $onFooCallback;

    // Register the callback that receives each "foo" value
    public function onFoo(callable $onFooCallback)
    {
        $this->onFooCallback = $onFooCallback;
    }

    public function key(string $key): void
    {
        $this->isReadingFoo = ($key === 'foo');
    }

    public function value($value): void
    {
        if ($this->isReadingFoo) {
            call_user_func($this->onFooCallback, $value);
            $this->isReadingFoo = false;
        }
    }
}
Now this function starts the parser, stops it every time a foo value has been collected, yields that value, then resumes parsing:
function fooIterator($stream) {
    $jsonListener = new FooJsonListener();
    $jsonParser = new JsonStreamingParser\Parser($stream, $jsonListener);
    $lastFoo = null;
    $shouldYieldFoo = false;

    // Remember the value and interrupt parsing so it can be yielded
    $jsonListener->onFoo(function ($foo) use (&$lastFoo, &$shouldYieldFoo, $jsonParser) {
        $lastFoo = $foo;
        $shouldYieldFoo = true;
        $jsonParser->stop();
    });

    while (true) {
        $jsonParser->parse();
        if ($shouldYieldFoo) {
            yield $lastFoo;
            $shouldYieldFoo = false;
        } else {
            // parse() returned without finding a foo value: end of stream
            break;
        }
    }
}
It looks awful, but it should work. It does not: after the first yield, the next parse() call throws:
JsonStreamingParser\Exception\ParsingException: "Parsing error in [1:1]. Expected ',' or ']' while parsing array. Got: {"
It works if we don't interrupt it with stop(), though.
The problem:
parse() reads the stream in chunks but parses each chunk character by character. Any character may trigger a call to the listener to register a parsed token. After every character, the parser checks the stopParsing flag, which stop() sets.
So in most cases, calling stop() from a listener interrupts parsing in the middle of a chunk. The rest of that chunk is discarded, and the next parse() call continues from the next chunk rather than from where parsing stopped.
Also, stopParsing is never reset to false.
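To make the failure mode concrete, here is a toy model of the behaviour described above. It is only a sketch: readUntilStop() is a hypothetical function, not the library's code, and it assumes nothing about the parser beyond "read a chunk, walk it character by character, return as soon as a stop is requested".

```php
<?php
// Hypothetical toy model, NOT jsonstreamingparser code.
// Reads the stream in chunks, consumes characters one at a time, and
// returns as soon as $stopChar is seen, dropping the rest of the chunk.
function readUntilStop($stream, int $chunkSize, string $stopChar): string
{
    $seen = '';
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $length = strlen($chunk);
        for ($i = 0; $i < $length; $i++) {
            $seen .= $chunk[$i];
            if ($chunk[$i] === $stopChar) {
                // The unparsed remainder of $chunk is lost here.
                return $seen;
            }
        }
    }
    return $seen;
}

$stream = fopen('php://memory', 'r+');
fwrite($stream, '[1,2,3,4,5]');
rewind($stream);

// Chunks of 4 bytes: "[1,2", ",3,4", ",5]"
$first = readUntilStop($stream, 4, '1');  // stops inside the first chunk
$second = readUntilStop($stream, 4, '3'); // resumes at the second chunk
```

The first call consumes "[1" and drops ",2"; the second call starts at ",3,4", so the element 2 is never seen. A real JSON parser resuming at that point finds a token its state machine does not expect, which matches the kind of exception shown above.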
Possible fixes:
a. Store the number of bytes actually consumed when stop() is called, and fseek to that offset on the next parse() (this would not work with non-seekable streams, though).
b. Better: store the unparsed remainder of the chunk on stop() and prepend it to the data read on the next parse().
Sadly, neither of these can be implemented by extending the classes, because the relevant members are private.
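For illustration, here is a minimal sketch of fix (b) on a toy character parser rather than on the library's actual Parser class. ResumableCharParser, its $leftover property and the digits() generator are all hypothetical names invented for this example; the point is only the two changes: keep the unparsed remainder of the chunk on stop, and reset the stop flag at the start of parse().

```php
<?php
// Hypothetical toy parser, NOT the library's Parser class.
final class ResumableCharParser
{
    /** @var resource */
    private $stream;
    /** @var callable */
    private $onChar;
    private int $bufferSize;
    private string $leftover = '';     // unparsed remainder of the last chunk
    private bool $stopParsing = false;

    public function __construct($stream, callable $onChar, int $bufferSize = 8192)
    {
        $this->stream = $stream;
        $this->onChar = $onChar;
        $this->bufferSize = $bufferSize;
    }

    public function stop(): void
    {
        $this->stopParsing = true;
    }

    public function parse(): void
    {
        // Reset the flag so parse() can be called again after stop().
        $this->stopParsing = false;

        while (true) {
            if ($this->leftover !== '') {
                // Resume with what the previous call left unparsed.
                $chunk = $this->leftover;
                $this->leftover = '';
            } else {
                if (feof($this->stream)) {
                    return;
                }
                $chunk = fread($this->stream, $this->bufferSize);
                if ($chunk === false || $chunk === '') {
                    return;
                }
            }

            $length = strlen($chunk);
            for ($i = 0; $i < $length; $i++) {
                ($this->onChar)($chunk[$i]);
                if ($this->stopParsing) {
                    // Keep the rest of the chunk instead of discarding it.
                    $this->leftover = (string) substr($chunk, $i + 1);
                    return;
                }
            }
        }
    }
}

// Generator wrapper in the same style as fooIterator(): yields every digit.
function digits($stream, int $bufferSize = 4): Generator
{
    $last = null;
    $parser = null;
    $parser = new ResumableCharParser(
        $stream,
        function (string $char) use (&$last, &$parser): void {
            if (ctype_digit($char)) {
                $last = $char;
                $parser->stop();
            }
        },
        $bufferSize
    );

    while (true) {
        $last = null;
        $parser->parse();
        if ($last === null) {
            return; // parse() reached the end without finding a digit
        }
        yield $last;
    }
}
```

With a stream containing "a1b2c3d4e5" and a 4-byte buffer, digits() yields all five digits even though every stop happens in the middle of a chunk. Unlike fix (a), this approach needs no fseek, so it should also work on non-seekable streams.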