TDB2 fails to roundtrip non-canonical decimals. #3404

@Aklakan

Description

Version

5.6.0-SNAPSHOT

What happened?

Update: The issue turned out to be specific to decimals. Decimals were always returned in canonical lexical form, even when the internal representation mapped to a different lexical form, because BigDecimal.scale was not canonicalized.
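
For readers unfamiliar with the role of BigDecimal.scale, the following is a minimal, standalone sketch using only java.math.BigDecimal (no Jena code). It shows that "18" and "18.0" denote the same value but are distinct BigDecimal representations because their scales differ. The class and method names are made up for illustration; this is not how TDB2 encodes decimals internally.

```java
import java.math.BigDecimal;

public class DecimalScaleDemo {
    // Hypothetical helper: representation equality (unscaled value AND scale must match).
    static boolean sameRepresentation(String a, String b) {
        return new BigDecimal(a).equals(new BigDecimal(b));
    }

    // Hypothetical helper: numeric equality, which ignores the scale.
    static boolean sameValue(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("18");
        BigDecimal b = new BigDecimal("18.0");

        System.out.println(a.scale());                    // 0
        System.out.println(b.scale());                    // 1
        System.out.println(sameRepresentation("18", "18.0")); // false
        System.out.println(sameValue("18", "18.0"));          // true

        // Stripping trailing zeros brings both to the same scale,
        // after which equals() also agrees.
        System.out.println(a.stripTrailingZeros().equals(b.stripTrailingZeros())); // true
    }
}
```

Any component that stores the scale but always prints one fixed form (or the reverse) can therefore hand back a lexical form that no longer matches what is stored.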

With TDB2, numeric literals seem to be exposed in their canonical form, which may not match the stored lexical form. As a result, checking for containment of triples that were previously returned may unexpectedly fail. In my case, this behavior breaks a (custom) sameAs reasoner. It seems that the lexical form is still available in TDB2, but it would have to be exposed.
Note that the incorrect canonical value already appears at the Node level; it does not seem to be a mere artifact of result set serialization. To me, this seems to be a bug in TDB2. The behavior is as expected for the in-memory dataset.

    @Test
    public void testNumeric() {
        System.out.println("Memory:");
        doEval(DatasetGraphFactory.create());
        System.out.println();
        System.out.println("TDB2:");
        doEval(TDB2Factory.createDataset().asDatasetGraph());
    }

    private static void doEval(DatasetGraph dsg) {
        try (AutoTxn txn = Txn.autoTxn(dsg, ReadWrite.WRITE)) {
            // Insert a decimal literal whose lexical form "18" has no fractional part.
            UpdateExec.dataset(dsg).update("INSERT DATA { <urn:s> <urn:p> '18'^^<http://www.w3.org/2001/XMLSchema#decimal> }").execute();

            // Read the triple back via SELECT.
            Table table = QueryExec.dataset(dsg).query("SELECT * { ?s ?p ?o }").table();
            Triple t = Substitute.substitute(Triple.create(Var.alloc("s"), Var.alloc("p"), Var.alloc("o")), table.rows().next());
            System.out.println("Obtained triple via SELECT: " + t);

            // Read the data back via CONSTRUCT and print it as N-Triples.
            Graph rg = QueryExec.dataset(dsg).query("CONSTRUCT WHERE { ?s ?p ?o }").construct();
            System.out.println("Obtained graph via CONSTRUCT:");
            RDFDataMgr.write(System.out, rg, RDFFormat.NTRIPLES);

            // Look up the triple exactly as it was returned; both checks are expected to be true.
            boolean b = QueryExec.dataset(dsg).query("ASK {" + NodeFmtLib.strNT(t) + " }").ask();
            System.out.println("Sparql Ask: " + b);
            Iterator<Triple> it = dsg.getDefaultGraph().find(t);
            try {
                System.out.println("Graph.find: " + it.hasNext());
            } finally {
                Iter.close(it);
            }
            txn.commit();
        }
    }

Output:

Memory:
Obtained triple via SELECT: urn:s urn:p "18"^^xsd:decimal
Obtained graph via CONSTRUCT:
<urn:s> <urn:p> "18"^^<http://www.w3.org/2001/XMLSchema#decimal> .
Sparql Ask: true
Graph.find: true

TDB2:
Obtained triple via SELECT: urn:s urn:p "18.0"^^xsd:decimal
Obtained graph via CONSTRUCT:
<urn:s> <urn:p> "18.0"^^<http://www.w3.org/2001/XMLSchema#decimal> .
Sparql Ask: false
Graph.find: false

Note that when using "18"^^xsd:decimal, the lookups return true for TDB2.
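
The "18.0" in the TDB2 output is consistent with the XSD 1.0 canonical representation of xsd:decimal, which requires a decimal point with at least one digit on each side. As a rough illustration of how such a canonicalization turns "18" into "18.0", here is a standalone sketch; canonicalLexical is a hypothetical helper written for this example, not a Jena API, and it is only an assumption that TDB2 does something comparable.

```java
import java.math.BigDecimal;

public class DecimalCanonicalForm {
    // Hypothetical helper: maps any valid decimal lexical form to an
    // XSD 1.0-style canonical form (decimal point required, no redundant zeros).
    static String canonicalLexical(String lexical) {
        BigDecimal value = new BigDecimal(lexical);
        if (value.signum() == 0) {
            return "0.0";
        }
        // Drop trailing zeros, then print without exponent notation.
        String plain = value.stripTrailingZeros().toPlainString();
        // Integers come out without a fraction part; the canonical form needs one.
        return plain.contains(".") ? plain : plain + ".0";
    }

    public static void main(String[] args) {
        System.out.println(canonicalLexical("18"));     // 18.0
        System.out.println(canonicalLexical("18.0"));   // 18.0
        System.out.println(canonicalLexical("18.50"));  // 18.5
        System.out.println(canonicalLexical("1800"));   // 1800.0
    }
}
```

Because several lexical forms collapse to one canonical form, a lookup that compares terms by lexical form can only succeed if the form handed back matches the form that was stored.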

Relevant output and stacktrace

Are you interested in making a pull request?

Maybe
