Parsing into Map #49

@piotrrzysko

Introduction

Sometimes users want to parse a JSON object into a map.

Let's assume that we have the following example object:

{
    "intKey": 123,
    "objKey": {
        "key1": "abc",
        "key2": false
    },
    "arrayKey": [1, 2, 3]
}

We expect the parser to produce a Map<String, Object> from which we should be able to extract the object's fields in the following way:

Map<String, Object> map = parser.parse(bytes, bytes.length, Map.class);

int intValue = (int) map.get("intKey");

Map<String, Object> obj = (Map<String, Object>) map.get("objKey");
String value1 = (String) obj.get("key1");
boolean value2 = (boolean) obj.get("key2");

List<Object> array = (List<Object>) map.get("arrayKey");
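
To make the expected shape concrete without depending on the parser, here is a minimal sketch that builds the same structure by hand and reads it back with the casts shown above. The class name `ExpectedMapShape` and the method `expectedResult` are hypothetical, and it is an assumption that JSON integers would surface as `Integer`, nested objects as `Map`, and arrays as `List`.

```java
import java.util.List;
import java.util.Map;

public class ExpectedMapShape {

    // Hand-built stand-in for the map the parser is expected to return
    // for the example object. This does NOT call the real parser.
    static Map<String, Object> expectedResult() {
        return Map.of(
                "intKey", 123,                                   // assumed to be boxed as Integer
                "objKey", Map.of("key1", "abc", "key2", false),  // nested object as a Map
                "arrayKey", List.of(1, 2, 3));                   // array as a List
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        Map<String, Object> map = expectedResult();

        int intValue = (int) map.get("intKey");

        Map<String, Object> obj = (Map<String, Object>) map.get("objKey");
        String value1 = (String) obj.get("key1");
        boolean value2 = (boolean) obj.get("key2");

        List<Object> array = (List<Object>) map.get("arrayKey");

        System.out.println(intValue + " " + value1 + " " + value2 + " " + array);
    }
}
```

Every read needs a cast because the value type is `Object`, and the nested casts are unchecked, so a mismatch only shows up at run time.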

Question

Let’s assume that the parser exposes an API like:

Map<String, Object> map = parser.parse(bytes, bytes.length, Map.class);

The returned map is immutable.

JSON parsing benchmarks often show that, in Java, creating new strings takes a significant portion of the time. So, the question is: at which stage should this happen? I see two options:

Option 1

Map<String, Object> map = parser.parse(bytes, bytes.length, Map.class);

// at this point all Strings are created

String value1 = (String) map.get("key"); // this doesn’t create a new one
String value2 = (String) map.get("key"); // this doesn’t create a new one either

Option 2

Map<String, Object> map = parser.parse(bytes, bytes.length, Map.class);

// at this point, the map only holds its own copy of a byte array with all parsed strings, but no instance of String has been created so far

String value1 = (String) map.get("key"); // this creates a new instance of String
String value2 = (String) map.get("key"); // this also creates a new instance of String

I suppose the second option is far more efficient when a caller reads only a small subset of the fields and reads each of them only once. The trade-off is that every repeated read of the same field allocates another String.
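
The difference between the two options can be sketched in plain Java, independent of the parser. This is only an illustration under stated assumptions: `StringMaterializationSketch`, `eager`, and `LazyStringMap` are hypothetical names, the input is simplified to pre-split UTF-8 byte arrays instead of a real JSON buffer, only string values are handled, and the lazy lookup uses a linear scan for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class StringMaterializationSketch {

    /** Option 1 stand-in: decode every key and value up front. */
    static Map<String, String> eager(byte[][] keys, byte[][] values) {
        Map<String, String> decoded = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            decoded.put(new String(keys[i], StandardCharsets.UTF_8),
                        new String(values[i], StandardCharsets.UTF_8));
        }
        return Map.copyOf(decoded); // immutable, as the issue proposes
    }

    /** Option 2 stand-in: keep the raw bytes and decode on every read. */
    static final class LazyStringMap {
        // A real implementation would hold its own copy of the bytes;
        // this sketch keeps the caller's arrays to stay short.
        private final byte[][] keys;
        private final byte[][] values;

        LazyStringMap(byte[][] keys, byte[][] values) {
            this.keys = keys;
            this.values = values;
        }

        String get(String key) {
            byte[] wanted = key.getBytes(StandardCharsets.UTF_8);
            for (int i = 0; i < keys.length; i++) {
                if (Arrays.equals(keys[i], wanted)) {
                    // A new String is allocated on each call.
                    return new String(values[i], StandardCharsets.UTF_8);
                }
            }
            return null; // key not present
        }
    }

    private static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[][] keys = { utf8("key"), utf8("other") };
        byte[][] values = { utf8("abc"), utf8("xyz") };

        Map<String, String> eagerMap = eager(keys, values);
        // Same stored instance is returned on both reads.
        System.out.println(eagerMap.get("key") == eagerMap.get("key")); // prints "true"

        LazyStringMap lazyMap = new LazyStringMap(keys, values);
        // Equal content, but a distinct instance per read.
        System.out.println(lazyMap.get("key") == lazyMap.get("key"));      // prints "false"
        System.out.println(lazyMap.get("key").equals(lazyMap.get("key"))); // prints "true"
    }
}
```

In the eager variant the decoding cost is paid once for every field, whether or not it is read. In the lazy variant nothing is decoded until `get` is called, but the cost is paid again on each call.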

@ZhaiMo15 @zekronium since you reported this topic, what are your thoughts? I’d like to understand your use cases better to be able to choose a more suitable option or come up with something else.
