Description
I'm wondering if it would be possible to add an attribute to the result that reports how many characters were parsed. My use case involves parsing a string that contains random text interspersed with multiple JSON objects. So I want to do something like this:
    from dirtyjson import loads

    text = 'example text containing {"foo":0, "bar":1} multiple json objects {"bazz":2, "boo":3} possibly separated by random text [1,2,4,7] and other junk'
    while len(text) > 0:
        # skip ahead to the next object/array
        if not text.startswith('{') and not text.startswith('['):
            index = next((i for i, c in enumerate(text) if c in ('{', '[')), -1)
            if index == -1:
                break  # no more objects to eat
            text = text[index:]
        # parse the current object
        chunk = loads(text)
        # do something with the JSON object
        print(chunk)
        # strip the object out of the string
        characters_eaten = ...  # somehow get the number of characters used for the parse
        text = text[characters_eaten:]

But right now it's not really feasible to do this, because there's no way to measure how many characters were eaten while parsing the current object. I guess it would technically be possible to use the row/column annotation of the last element in the object/list and then search for the closing delimiter, but that's super cumbersome. Having the number of characters eaten would be very useful.
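For comparison, here is a rough sketch of the behaviour I'm after, written against the standard library's `json.JSONDecoder.raw_decode`, which returns the parsed value together with the index where parsing stopped. It only handles strict JSON, not the relaxed syntax dirtyjson accepts, and `iter_json_chunks` is just a hypothetical helper name for illustration, not anything dirtyjson provides.

```python
import json


def iter_json_chunks(text):
    """Yield (value, chars_eaten) for each JSON object/array embedded in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Find the next opening delimiter at or after pos.
        starts = [i for i in (text.find('{', pos), text.find('[', pos)) if i != -1]
        if not starts:
            break  # no more objects to eat
        start = min(starts)
        try:
            # raw_decode returns the value and the index just past it.
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, keep scanning
            continue
        yield value, end - start
        pos = end


text = 'example text containing {"foo":0, "bar":1} multiple json objects {"bazz":2, "boo":3} possibly separated by random text [1,2,4,7] and other junk'
for value, chars_eaten in iter_json_chunks(text):
    print(value, chars_eaten)
```

Something equivalent to that second return value, exposed by `loads` or as an attribute on the result, would cover my use case.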