Ability to know how many characters were parsed #8

@david-andrew

I'm wondering if it would be possible to add an attribute to the result for how many characters were parsed. My use case involves parsing a string that contains random text interspersed with multiple JSON objects. So I want to do something like this:

from dirtyjson import loads

text = 'example text containing {"foo":0, "bar":1} multiple json objects {"bazz":2, "boo":3} possibly separated by random text [1,2,4,7] and other junk'

while len(text) > 0:
    #skip text ahead to next object/array
    if not text.startswith('{') and not text.startswith('['):
        index = next((i for i, c in enumerate(text) if c in ('{', '[')), -1)
        if index == -1:
            break #no more objects to eat
        text = text[index:]
    
    #parse the current object
    chunk = loads(text)

    #do something with the json object
    print(chunk)

    #strip the object out of the string
    characters_eaten = #somehow get the number of characters used for the parse
    text = text[characters_eaten:]

But right now it's really not feasible to do this because there's no way to measure how many characters were eaten while parsing the current object. I guess technically it would be possible to use the row/column annotation of the last element in the object/list and then find the closing delimiter, but that's super cumbersome. Having the number of characters eaten would be very useful.
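
For reference, the shape of what I'm asking for is similar to what the standard library's `json.JSONDecoder.raw_decode` provides for strict JSON: it returns the parsed value together with the index where parsing stopped. The sketch below only illustrates that shape. It is not dirtyjson's API, `iter_json_values` is a hypothetical helper name, and strict `json` will reject the malformed input that dirtyjson tolerates.

```python
import json


def iter_json_values(text):
    """Yield (value, start, end) for each strict-JSON object/array found in text.

    Hypothetical helper for illustration; uses the standard library, not dirtyjson.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Locate the next candidate opening delimiter at or after pos.
        candidates = [i for i in (text.find('{', pos), text.find('[', pos)) if i != -1]
        if not candidates:
            break  # no more objects or arrays to eat
        start = min(candidates)
        try:
            # raw_decode returns the value and the index just past its last character.
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON at this delimiter; move past it and keep scanning.
            pos = start + 1
            continue
        yield value, start, end
        pos = end


text = 'example text containing {"foo":0, "bar":1} multiple json objects {"bazz":2, "boo":3} possibly separated by random text [1,2,4,7] and other junk'

for value, start, end in iter_json_values(text):
    characters_eaten = end - start
    print(value, characters_eaten)
```

If `loads` exposed an equivalent end index (or a count of characters consumed), the loop in my original snippet would work as written.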
