The performance of DOM Parser and Schema-Based Parser.

I've been benchmarking simdjson recently. The basic test is similar to the default benchmark and uses twitter.json, as shown below:
```java
    @Benchmark
    public int recordSimdjson() {
        Set<String> defaultUsers = new HashSet<>();
        TwitterRecord twitter = simdJsonParser.parse(buffer, buffer.length, TwitterRecord.class);
        for (StatusRecord status : twitter.statuses()) {
            UserRecord user = status.user();
            if (user.default_profile()) {
                defaultUsers.add(user.screen_name());
            }
        }
        return defaultUsers.size();
    }

    @Benchmark
    public int JsonValueSimdjson() {
        JsonValue simdJsonValue = simdJsonParser.parse(buffer, buffer.length);
        Set<String> defaultUsers = new HashSet<>();
        Iterator<JsonValue> tweets = simdJsonValue.get("statuses").arrayIterator();
        while (tweets.hasNext()) {
            JsonValue tweet = tweets.next();
            JsonValue user = tweet.get("user");
            if (user.get("default_profile").asBoolean()) {
                defaultUsers.add(user.get("screen_name").asString());
            }
        }
        return defaultUsers.size();
    }

    @Benchmark
    public int recordJackson() throws IOException {
        Set<String> defaultUsers = new HashSet<>();
        TwitterRecord twitter = objectMapper.readValue(buffer, TwitterRecord.class);
        for (StatusRecord status : twitter.statuses()) {
            UserRecord user = status.user();
            if (user.default_profile()) {
                defaultUsers.add(user.screen_name());
            }
        }
        return defaultUsers.size();
    }

    record UserRecord(boolean default_profile, String screen_name) {
    }

    record StatusRecord(UserRecord user) {
    }

    record TwitterRecord(List<StatusRecord> statuses) {
    }
```

The difference is that I shrank the `statuses` array. The default size is 101, and I tested with 101, 51, and 1 entries. The results are below:
size 101:
<img width="1206" alt="image" src="https://github.com/simdjson/simdjson-java/assets/35990759/0dea0ee5-84ac-43a4-ad1a-c52f11b27f2f">

size 51:
<img width="1196" alt="image" src="https://github.com/simdjson/simdjson-java/assets/35990759/62ec2998-108f-42ef-b7ea-f87bed6a4756">

size 1:
<img width="1206" alt="image" src="https://github.com/simdjson/simdjson-java/assets/35990759/ca0ae43f-4d51-4661-bcab-a34e80148fb1">
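
For anyone who wants to reproduce the size variation without hand-editing twitter.json, here is a minimal sketch of one way to generate inputs with a chosen number of `statuses` entries. It is not how the inputs above were produced: `StatusesFixture`, `build`, and `buildBytes` are hypothetical names, and the document is a synthetic stand-in that contains only the fields the benchmarks in this issue read, so its absolute numbers will not be comparable to the twitter.json results.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of simdjson-java or its benchmarks.
final class StatusesFixture {

    // Builds a synthetic JSON document with `count` entries in "statuses".
    // Assumption: only id, text, user.default_profile and user.screen_name
    // are present; the real twitter.json entries carry many more fields.
    static String build(int count) {
        StringBuilder sb = new StringBuilder("{\"statuses\":[");
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(i)
              .append(",\"text\":\"tweet ").append(i).append('"')
              .append(",\"user\":{\"default_profile\":").append(i % 2 == 0)
              .append(",\"screen_name\":\"user").append(i).append("\"}}");
        }
        return sb.append("]}").toString();
    }

    // UTF-8 bytes of the document, the form the parsers take as input.
    static byte[] buildBytes(int count) {
        return build(count).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three sizes tested in this issue.
        for (int count : new int[] {101, 51, 1}) {
            byte[] buffer = buildBytes(count);
            System.out.println(count + " statuses -> " + buffer.length + " bytes");
        }
    }
}
```

The array returned by `buildBytes` could then be used as `buffer` in the benchmark methods, with one run per size.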


I also changed the depth of the fields that the test reads. The default is 3 (`statuses` → `user` → field), and I reduced it to 2 (`statuses` → field), as shown below:
```java
    @Benchmark
    public int recordSimdjson() {
        Set<Object> defaultUsers = new HashSet<>();
        TwitterRecord twitter = simdJsonParser.parse(buffer, buffer.length, TwitterRecord.class);
        for (StatusRecord status : twitter.statuses()) {
            long id = status.id();
            String text = status.text();
            defaultUsers.add(id);
            defaultUsers.add(text);
        }
        return defaultUsers.size();
    }

    @Benchmark
    public int JsonValueSimdjson() {
        JsonValue simdJsonValue = simdJsonParser.parse(buffer, buffer.length);
        Set<Object> defaultUsers = new HashSet<>();
        Iterator<JsonValue> tweets = simdJsonValue.get("statuses").arrayIterator();
        while (tweets.hasNext()) {
            JsonValue tweet = tweets.next();
            JsonValue id = tweet.get("id");
            JsonValue text = tweet.get("text");
            defaultUsers.add(id.asLong());
            defaultUsers.add(text.asString());
        }
        return defaultUsers.size();
    }

    @Benchmark
    public int recordJackson() throws IOException {
        Set<Object> defaultUsers = new HashSet<>();
        TwitterRecord twitter = objectMapper.readValue(buffer, TwitterRecord.class);
        for (StatusRecord status : twitter.statuses()) {
            long id = status.id();
            String text = status.text();
            defaultUsers.add(id);
            defaultUsers.add(text);
        }
        return defaultUsers.size();
    }

    record StatusRecord(long id, String text) {
    }

    record TwitterRecord(List<StatusRecord> statuses) {
    }
```

The results are:
size 101:
<img width="1208" alt="image" src="https://github.com/simdjson/simdjson-java/assets/35990759/56a61939-9842-46d9-8fb5-4601d88dd49d">

size 51:
<img width="1208" alt="image" src="https://github.com/simdjson/simdjson-java/assets/35990759/d1e4d151-74a5-4d5e-8841-9070f59833c2">

size 1:
<img width="1205" alt="image" src="https://github.com/simdjson/simdjson-java/assets/35990759/1dac2f69-6fec-4861-921c-139947a7acdc">


Here are my questions:
1. Is simdjson not always faster than Jackson? Does simdjson perform worse the shorter the JSON is? If my JSON is short, should I avoid simdjson?
2. DOM parser vs. schema-based parser: does the relative performance also depend on the size of the JSON? I initially expected the schema-based parser to be faster.
