Description
When I annotate text with the DBpedia Spotlight model, the results vary considerably with the length of the input. For short texts the recognition is correct; for longer texts the predictions are wrong.
For example, for the following sample the predictions are correct.
Qualifikationen ||| Branchenkenntnisse ||| Öffentlicher Sektor, Banken, Dienstleistung allgemein, Handel, Luft- und Raumfahrtindustrie , Telekommunikation, Kassenärzt liche Vereinigung, Energieversorger, Erstversicherung (GKV, SHU, K, L), Rückversicherung ||| Sprachkenntnisse ||| Deutsch , ||| Englisch ||| Technische Kenntnisse ||| Anwendungssoftware: ||| MS-Office, MS-Visio, MS-Access, MS-Project ||| Betriebssysteme : ||| Windows, Unix, Linux ||| Datenbanken: ||| Sybase, DB/2, MS-SQLServer, mySQL, Oracle, Cassandra, SAP DB, Informix, MS-Access ||| Programmiersprachen: ||| Java, Javascript, Groovy, Assembler, C, C++, C#, COBOL, Delphi, HTML, PHP, Visual Basic, PL/SQL
However, if the same sample is part of a 3-4 page document, the results are not consistent. For the word Deutsch, for example, I get something like:
<Resource URI="http://de.dbpedia.org/resource/Deutschland" support="172968" types="Wikidata:Q6256,Schema:Place,Schema:Country,DBpedia:PopulatedPlace,DBpedia:Place,DBpedia:Location,DBpedia:Country" surfaceForm="Deutsch" offset="2113" similarityScore="0.9999999998672138" percentageOfSecondRank="1.328211046569277E-10"/>
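To make the fields of that annotation easier to compare between the short and the long run, here is a minimal Python sketch that parses the `<Resource>` element shown above using only the standard library. The function name `parse_resource` and the dictionary keys are my own choices for illustration, not part of Spotlight; the attribute names are taken from the output above.

```python
import xml.etree.ElementTree as ET

# The <Resource> element exactly as returned for the long document.
RESOURCE_XML = (
    '<Resource URI="http://de.dbpedia.org/resource/Deutschland" '
    'support="172968" '
    'types="Wikidata:Q6256,Schema:Place,Schema:Country,DBpedia:PopulatedPlace,'
    'DBpedia:Place,DBpedia:Location,DBpedia:Country" '
    'surfaceForm="Deutsch" offset="2113" '
    'similarityScore="0.9999999998672138" '
    'percentageOfSecondRank="1.328211046569277E-10"/>'
)


def parse_resource(xml_text):
    """Turn one <Resource> element into a dict with typed values.

    Hypothetical helper for illustration; it only reads the attributes
    visible in the output above.
    """
    elem = ET.fromstring(xml_text)
    return {
        "uri": elem.get("URI"),
        "surface_form": elem.get("surfaceForm"),
        "offset": int(elem.get("offset")),
        "support": int(elem.get("support")),
        "types": elem.get("types").split(","),
        "similarity_score": float(elem.get("similarityScore")),
        "percentage_of_second_rank": float(elem.get("percentageOfSecondRank")),
    }


resource = parse_resource(RESOURCE_XML)
print(resource["surface_form"], "->", resource["uri"])
print("similarity:", resource["similarity_score"])
print("second rank ratio:", resource["percentage_of_second_rank"])
```

Running the same parser over the annotations from the short text and from the long document would make it straightforward to see which surface forms are linked to different URIs in the two runs.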
Why do the results differ depending on the length of the text? Also, the Resource URI does not respond when I open it directly. Is there a reason for that?