nom parser instead of ad-hoc in examples#20122
nom parser instead of ad-hoc in examples#20122cj-zhukov wants to merge 5 commits intoapache:mainfrom
Conversation
High-Level OverviewThis PR is an exploratory step to evaluate whether using a parser combinator library ( In a previous PR #19750, the parsing of subcommands and example metadata in
Personally, I found the That said, this PR is intentionally limited in scope. If the project sees value in adopting Feedback on whether this trade-off is worthwhile is very welcome. |
|
I'd like to keep the parser simple for now. Currently, it can't handle extra symbols like () in the description of an example. In practice, only one group I'm happy to improve the parser in the future to handle such cases more robustly if needed. For now, this keeps the code readable and avoids unnecessary complexity. |
|
@Jefffrey since you helped with previous PRs related to example docs generation, it would be great if you could take a look at this one as well. Your feedback or any improvements would be much appreciated. |
comphead
left a comment
There was a problem hiding this comment.
Thanks @cj-zhukov I have some feeling the examples_docs can be renamed to reflect it is a parser for examples, perhaps it would be good to split into smaller utility files as well. But this can be addressed in the following PR.
One thing to note is: nom not very actively developing
datafusion-examples/README.md
Outdated
| | udwf | [`udf/simple_udwf.rs`](examples/udf/simple_udwf.rs) | Simple UDWF example | | ||
| | Subcommand | File Path | Description | | ||
| | ---------- | ------------------------------------------------------- | --------------------------------------------- | | ||
| | adv_udaf | [`udf/advanced_udaf.rs`](examples/udf/advanced_udaf.rs) | Advanced User Defined Aggregate Function UDAF | |
There was a problem hiding this comment.
nit: I think the braces were nice since it was just highlighting the abbreviation
There was a problem hiding this comment.
Good point - I agree that supporting extra symbols like parentheses in the description would be better.
I’ll update the parser in this PR to handle that case more robustly.
xudong963
left a comment
There was a problem hiding this comment.
Happy to see we use nom to do the work. (I had some good experiences using it to build sql parser years ago)
IMO, It has become relatively mature. |
Thanks for the feedback! I agree that To keep this PR scoped and easy to review, I’d prefer to address renaming and refactoring in a follow-up PR. I’ll open one specifically for improving naming and structure. Thanks also for the note about |
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?