

Offsets in the response are not valid

I parsed the JSON response and found it quite hard to align the offset given for an entity with its position in the raw text input. There are several reasons for this:

1. Each document carries its own additional offset (metadata with a hash and other info), which makes the reported offset invalid.

2. Newlines and any symbols that are not encoded properly (e.g., "company\u2019s") shift the offsets so far that the index we need cannot be recovered.

Could you please help me figure out the simplest way to process offsets?

intelligent-tagging-api intelligent-tagging open-calais-api semantic-metadata-tagging json parsing

1 Answer


Proper encoding is important. Here's a jsfiddle that might get you started on the right track:

https://jsfiddle.net/84255hgk/
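For anyone who would rather work in Python, here is a rough sketch of the same idea, not the service's documented behavior. It assumes each entity instance gives you the matched string and a numeric offset (called `exact` and `offset` here, both assumptions), and that `text` is the document after JSON decoding (for example via `json.loads`), so an escape such as `\u2019` is already a single character. The helper names are hypothetical. It tries the offset as a character index, then as a UTF-8 byte offset, and finally falls back to the nearest occurrence of the string, which may also help when a metadata header shifts everything by a constant.

```python
from typing import Optional


def byte_offset_to_index(text: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset into a character index of `text`."""
    prefix = text.encode("utf-8")[:byte_offset]
    # errors="ignore" drops a partial character if the cut lands mid-sequence.
    return len(prefix.decode("utf-8", errors="ignore"))


def resolve_offset(text: str, exact: str, offset: int) -> Optional[int]:
    """Return the character index where `exact` starts, or None if absent.

    `exact` and `offset` are assumed field names for an entity instance.
    """
    offset = max(offset, 0)

    # 1. The offset may already be a character index into the decoded text.
    if text.startswith(exact, offset):
        return offset

    # 2. The offset may have been counted in UTF-8 bytes instead.
    as_chars = byte_offset_to_index(text, offset)
    if text.startswith(exact, as_chars):
        return as_chars

    # 3. Fall back to the occurrence closest to the reported offset.
    hits = []
    start = text.find(exact)
    while start != -1:
        hits.append(start)
        start = text.find(exact, start + 1)
    if not hits:
        return None
    return min(hits, key=lambda i: abs(i - offset))


text = "The company\u2019s CEO joined Acme.\nAcme grew."
print(resolve_offset(text, "Acme", 25))  # offset counted in characters
print(resolve_offset(text, "Acme", 27))  # same mention counted in UTF-8 bytes
```

The nearest-occurrence fallback is a heuristic: if the same string appears many times close together, it can pick the wrong mention, so treat its result as a best guess.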


@Tomasz Adamusiak, thank you for your reply. It led me in the right direction for solving the problem (I hope so).
