I need to build a tree like the following using the Lark package:
start
  expr
    or_expr
      and_expr
        comp_expr
          identifier Name
          comparator eq
          value 'Milk'
        comp_expr
          identifier Price
          comparator lt
          value 2.55
The grammar I am using is the following:
from lark import Lark
odata_grammar = r"""
start: expr
expr: or_expr
or_expr: and_expr ("or" and_expr)*
and_expr: comp_expr ("and" comp_expr)*
comp_expr: identifier comparator value -> comp_expr
comparator: "eq" | "lt" | "gt" | "le" | "ge" | "ne"
value: STRING | NUMBER
identifier: CNAME
STRING: /'(''|[^'])*'/
DATE: /\d{4}-\d{2}-\d{2}/
NUMBER: /-?\d+(\.\d+)?/
%import common.CNAME
%import common.WS
%ignore WS
"""
parser = Lark(odata_grammar, start='start', parser='lalr')
url_filter = "Name eq 'Milk' and Price lt 2.55"
tree = parser.parse(url_filter)
print(tree.pretty())
When I print this tree, I get the following instead:
start
  expr
    or_expr
      and_expr
        comp_expr
          identifier Name
          comparator
          value 'Milk'
        comp_expr
          identifier Price
          comparator
          value 2.55
For some reason the comparator's text does not appear in the tree. Lark clearly matches it, since the input parses and a comparator node is created, but the matched operator is not printed. Curiously, when I try to "force" it with an alias in the grammar, such as comparator: "eq" -> eq, the node is simply renamed to eq; I still do not get comparator eq.
1 Answer
See the Tree Construction section of the Lark documentation (https://lark-parser.readthedocs.io/en/stable/tree_construction.html):
"Lark filters out certain types of terminals by default, considering them punctuation:
Terminals that won’t appear in the tree are:
Unnamed literals (like "keyword" or "+")
Terminals whose name starts with an underscore (like _DIGIT)
Terminals that will appear in the tree are:
Unnamed regular expressions (like /[0-9]/)
Named terminals whose name starts with a letter (like DIGIT)"
So, option one: turn the string literals in your comparator rule into regexps:
odata_grammar = r"""
start: expr
expr: or_expr
or_expr: and_expr ("or" and_expr)*
and_expr: comp_expr ("and" comp_expr)*
comp_expr: identifier comparator value -> comp_expr
comparator: /eq/ | /lt/ | /gt/ | /le/ | /ge/ | /ne/
value: STRING | NUMBER
identifier: CNAME
STRING: /'(''|[^'])*'/
DATE: /\d{4}-\d{2}-\d{2}/
NUMBER: /-?\d+(\.\d+)?/
%import common.CNAME
%import common.WS
%ignore WS
"""
Option two: add rules for each comparator literal:
odata_grammar = r"""
start: expr
expr: or_expr
or_expr: and_expr ("or" and_expr)*
and_expr: comp_expr ("and" comp_expr)*
comp_expr: identifier comparator value -> comp_expr
comparator: eq | lt | gt | le | ge | ne
eq: "eq"
lt: "lt"
gt: "gt"
le: "le"
ge: "ge"
ne: "ne"
value: STRING | NUMBER
identifier: CNAME
STRING: /'(''|[^'])*'/
DATE: /\d{4}-\d{2}-\d{2}/
NUMBER: /-?\d+(\.\d+)?/
%import common.CNAME
%import common.WS
%ignore WS
"""
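Both solutions capture the operator in the parse tree. With option one it appears as a token directly under comparator (printed as comparator eq); with option two it appears as a child rule node named eq under comparator.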