
New ReadTimeout error when using rdp.get_data() on code that has worked without issue for last 2 months

The same code has been working without issue for the last two months but started throwing the following error over the last two days. Any ideas why this might be the case and how to fix it? I tried contacting support and they suggested it might be because of a workspace outage that occurred yesterday. However, I have checked the Refinitiv service alerts page and this problem seems to be fixed but I am still getting the same error. For context I am using rdp.get_data to pull data on ~100 TR fields for ~2,000 RICs.


screenshot-2021-08-03-at-104337.png

Tags: rdp-api, refinitiv-data-platform


@oliver.rogers19

Your search expression returns ~2.4K hits. Changing the value of MAX_RESULTS from 10K to 5K in the code snippet you shared couldn't have made any difference, as the size of the result set is well below both values.
The error message you saw comes from the datagrid endpoint on RDP, which corresponds to the rdp.get_data call. In other words, it's rdp.get_data that failed, not rdp.search.
I'm not sure how you structured your data retrieval using rdp.get_data. If you retrieve the entire dataset of 100 data items for 2.4K RICs in a single rdp.get_data call, that is quite a massive request, and I wouldn't be surprised if it times out occasionally. While there's no hardcoded limit on either the number of RICs or the number of data items in a request, the bigger the request, the likelier it is to fail. I would recommend breaking up the list of RICs into chunks of a couple of hundred each and making a separate rdp.get_data call for each chunk in a loop.
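For illustration, here is a minimal sketch of the chunking loop described above. The helper names (chunked, get_data_in_chunks, fake_fetch) are hypothetical and not part of the RDP library, and the data call is passed in as a plain callable so that the sketch runs without a session. The chunk size of 200 is only a starting point to tune.

```python
from typing import Any, Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def get_data_in_chunks(
    rics: Sequence[str],
    fields: Sequence[str],
    fetch: Callable[[List[str], List[str]], Any],
    chunk_size: int = 200,
) -> List[Any]:
    """Call `fetch` once per chunk of RICs and collect the results in order.

    `fetch` is an assumption: any callable taking (universe, fields),
    for example a thin wrapper around rdp.get_data.
    """
    results = []
    for chunk in chunked(rics, chunk_size):
        results.append(fetch(chunk, list(fields)))
    return results


def fake_fetch(universe: List[str], fields: List[str]) -> List[tuple]:
    """Stand-in for the real data call: one (RIC, field) pair per cell."""
    return [(ric, field) for ric in universe for field in fields]


# Made-up instrument codes, purely to exercise the loop.
rics = [f"RIC{i}" for i in range(450)]
parts = get_data_in_chunks(rics, ["TR.Revenue"], fake_fetch, chunk_size=200)
print([len(p) for p in parts])  # [200, 200, 50]
```

With the real library, fetch would presumably be something like lambda universe, fields: rdp.get_data(universe=universe, fields=fields), assuming that keyword signature matches your installed version. If each call returns a DataFrame, the per-chunk results can then be combined with pandas.concat.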


@Alex Putkov. Makes sense. Have split it up into batches of 200 RICs and it is working like a dream. Thank you for once again solving my issues :)

Hi @oliver.rogers19,


Would you mind sharing your code, with example instruments and fields (from your list of ~100 TR fields for ~2,000 RICs)?
I am asking so that I can check whether the method used to retrieve data from this endpoint is still appropriate, since it might have changed.

Could you also copy and paste the whole error message into a comment below? I noticed that the screenshot is clipped at the bottom.


Hi @jonathan.legrand, I have found a fix for this problem. I was using rdp.search() to retrieve a list of RICs of companies in a specific TRBC business sector and then using this list of RICs in a subsequent rdp.get_data() call. The code below threw the timeout error. I found that changing the value of MAX_RESULTS from 10000 to 5000 meant that the code worked again. Seems like it might be a rate limit error? But then again I was not getting the 'Error code 429 - Too Many Requests'. So still slightly confused why it was working for two months and then suddenly started throwing errors?

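import refinitiv.dataplatform as rdp  # assumes a session has already been opened

MIN_REVENUE = 100000000
MAX_RESULTS = 10000

def get_RICs():
    # Find organisations in the given TRBC business sector with revenue >= MIN_REVENUE
    RICs = rdp.search(
        view=rdp.SearchViews.Organisations,
        filter=f"RCSTRBC2012Name eq 'Basic materials\\Mineral resources' and Revenue ge {MIN_REVENUE}",
        select="PrimaryRIC",
        top=MAX_RESULTS)
    RICs.columns = ['RIC']
    return RICs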