Problem with retrieving all paginated REST API responses

Hi,

I'm conducting a research project in which I want to extract the metadata for all books that have a license and full-text information stored. Around 328 776 items should match this query, based on this request to the API:

filter=type:book,type:monograph,has-license:true,has-full-text:true&cursor=*&rows=500

With cursor-based retrieval, the responses usually stop about 1/3 of the way into the extraction: I can get around 60 000 records, but then every response suddenly fails with “SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data”.
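
For anyone following along, cursor-based retrieval of this query can be sketched roughly as below. This is a minimal sketch, not the exact script used here: it assumes Python, the public endpoint `https://api.crossref.org/works`, and the `items` and `next-cursor` fields of the response's `message` object. The function names `fetch_page` and `iter_records` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Crossref endpoint, not a proxied URL.
BASE_URL = "https://api.crossref.org/works"
FILTER = "type:book,type:monograph,has-license:true,has-full-text:true"


def fetch_page(cursor, rows=500):
    """Fetch one page and return the parsed "message" object."""
    query = urllib.parse.urlencode(
        {"filter": FILTER, "cursor": cursor, "rows": rows}
    )
    with urllib.request.urlopen(f"{BASE_URL}?{query}", timeout=60) as resp:
        return json.load(resp)["message"]


def iter_records(fetch=fetch_page, rows=500):
    """Yield every record, following next-cursor until a page is empty."""
    cursor = "*"  # only the very first request uses "*"
    while True:
        message = fetch(cursor, rows)
        items = message.get("items", [])
        if not items:
            break  # an empty page means the result set is exhausted
        yield from items
        cursor = message.get("next-cursor")
        if not cursor:
            break  # no cursor to continue with


# Hypothetical usage (performs real network requests):
#     for record in iter_records():
#         print(record.get("DOI"))
```

The point worth checking in any such loop is that only the first request sends `cursor=*`, and every later request sends the `next-cursor` value from the previous response.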

Something seems to be causing a problem with generating JSON responses for all items that match this query, but I have no idea what it could be. Does anyone have any ideas?

Thankful for any help!

Mikael

Hi @laakso ,

Thanks for your message. I have only ever seen that error when a metadata record is missing from that pool of the API. If you’re comfortable adding a mailto parameter to your query, you could try this in the Polite pool to see if you can replicate it there: https://0-api-crossref-org.library.alliant.edu/works?filter=type:book,type:monograph,has-license:true,has-full-text:true&cursor=*&rows=500&mailto=support@crossref.org

Let me know if that solves this issue.

-Isaac

Hi Isaac,

Thank you for the quick reply! I have tried adding my e-mail and spacing out the queries over time, but unfortunately I always run into the same problem: the error appears around the halfway point of running the queries. Is there anything else I could try to solve or work around this, so that I can extract at least most of the matches?
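
One possible workaround, sketched here under assumptions rather than as a known fix: the JSON.parse error means the response body was not JSON at all (for example an empty body or an HTML error page), so a script can detect that case, wait, and retry the same cursor instead of aborting. The helper names `parse_or_none` and `fetch_with_retry` are hypothetical, and `get_body` stands for whatever function returns the raw response text for a given cursor.

```python
import json
import time


def parse_or_none(body):
    """Return the decoded JSON document, or None if the body is not JSON."""
    try:
        return json.loads(body)
    except (ValueError, TypeError):
        return None


def fetch_with_retry(get_body, cursor, retries=5, base_delay=2.0,
                     sleep=time.sleep):
    """Call get_body(cursor) until it returns JSON, doubling the wait each time.

    get_body is a hypothetical callable that returns the raw response text.
    """
    for attempt in range(retries):
        data = parse_or_none(get_body(cursor))
        if data is not None:
            return data
        if attempt < retries - 1:
            sleep(base_delay * 2 ** attempt)  # 2 s, 4 s, 8 s, ...
    # Report the cursor so the run can be inspected or resumed from here.
    raise RuntimeError(
        f"no JSON response after {retries} attempts; last cursor: {cursor}"
    )
```

This would not address the underlying cause, and a cursor may stop being valid if the waits become too long, so writing each page to disk as it arrives keeps the records already retrieved.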

Best regards,

Mikael