Add PDF/EPUB IA links to search.json providers for open access books #11731
cdrini merged 5 commits into internetarchive:master
…t key tracking in `copydocs.py`.
for more information, see https://pre-commit.ci
…ad file retrieval with MOBI support, and improve `copydocs` script's saved document handling.
Pull request overview
This PR extends the Internet Archive provider so that OPDS results include direct download links (PDF/EPUB) for open-access IA books, aligning the API with the behavior shown in the linked issue and example payload.
Changes:
- Refactors `InternetArchiveProvider.get_acquisitions` to always create a primary “web” acquisition and, for open-access items, append additional acquisitions for direct PDF/EPUB downloads.
- Introduces `_get_ia_download_files`, which uses cached `ia.get_metadata()` to derive available IA file names and select appropriate `.pdf` and `.epub` assets.
- Leaves the global `get_acquisitions(solr_edition, edition)` dispatcher compatible while ensuring IA acquisitions now expose richer download options used downstream in OPDS responses.
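The file-selection step described for `_get_ia_download_files` can be sketched roughly as follows. This is only an illustration under assumptions: `pick_download_files` is a hypothetical name, and the input is assumed to be a plain list of file names already taken from the item's metadata, so it is not the PR's actual implementation.

```python
from typing import Iterable, Optional


def pick_download_files(file_names: Iterable[str]) -> dict[str, Optional[str]]:
    """Return the first .pdf and the first .epub file name found, if any.

    Hypothetical helper: the real code may rank or filter candidates differently.
    """
    picked: dict[str, Optional[str]] = {"pdf": None, "epub": None}
    for name in file_names:
        lower = name.lower()
        if lower.endswith(".pdf") and picked["pdf"] is None:
            picked["pdf"] = name
        elif lower.endswith(".epub") and picked["epub"] is None:
            picked["epub"] = name
    return picked
```

A format with no matching file stays `None`, so the caller can skip that acquisition instead of emitting a broken link.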
Hey @cdrini @mekarpeles, any chance we can get this merged? I'd like to have it ready to show to Brewster on Thursday, and set up so it uses the PRs service.
Hi @ronibhakta1! Apologies for the delay. This makes an individual network request for every search result, which is too much of a performance hit, I'm afraid. Let's instead assume that, for open-access materials, PDFs/EPUBs will always be present at the default paths, and see how that works as a heuristic.
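A minimal sketch of that heuristic, assuming the default path is `https://archive.org/download/<ocaid>/<ocaid>.<ext>` (the shape seen in the example payload in the PR description) and that `ebook_access == "public"` marks open-access items. The function and constant names are hypothetical, not the merged code.

```python
# Hypothetical helper; names are illustrative, not the merged Open Library code.
IA_DOWNLOAD_BASE = "https://archive.org/download"

# Extension -> media type, matching the types in the example payload.
DEFAULT_FORMATS = {
    "pdf": "application/pdf",
    "epub": "application/epub+zip",
}


def default_download_links(ocaid: str, ebook_access: str) -> list[dict[str, str]]:
    """Guess direct download links for an open-access IA item without a network call."""
    # Assumption: "public" is the ebook_access value for open-access items.
    if ebook_access != "public":
        return []
    return [
        {"href": f"{IA_DOWNLOAD_BASE}/{ocaid}/{ocaid}.{ext}", "type": media_type}
        for ext, media_type in DEFAULT_FORMATS.items()
    ]
```

Because nothing is fetched, the cost per search result is a couple of string formats. The trade-off is that a link can point at a file that does not exist for a particular item.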
Sounds great to me. I will work on the changes and have them ready for you to review again by tomorrow.
@cdrini I have made the required changes; you are good to go!
(Ignore the deleted CR comment; I misread the code!)
Lgtm! Tested here: https://testing.openlibrary.org/search.json?q=goody%20two%20shoes&fields=key,title,ia,providers,editions,ebook_access
Closes ArchiveLabs/pyopds2_openlibrary#20
Refactor OPDS results to return download links for IA books
Technical
Testing
{
  "metadata": {
    "title": "Robinson Crusoe",
    "type": "http://schema.org/Book",
    "language": ["en"],
    "author": [
      {
        "name": "Daniel Defoe",
        "links": [
          {
            "href": "http://localhost:8080/authors/OL18283A",
            "type": "text/html",
            "rel": "author"
          }
        ]
      }
    ],
    "numberOfPages": 403
  },
  "links": [
    {
      "href": "http://localhost:8080/opds/books/OL24152177M",
      "type": "application/opds-publication+json",
      "rel": "self"
    },
    {
      "href": "http://localhost:8080/books/OL24152177M",
      "type": "text/html",
      "rel": "alternate"
    },
    {
      "href": "http://localhost:8080/books/OL24152177M.json",
      "type": "application/json",
      "rel": "alternate"
    },
    {
      "href": "https://archive.org/details/cu31924011498676?view=theater&wrapper=false",
      "type": "text/html",
      "rel": "http://opds-spec.org/acquisition/open-access",
      "properties": {
        "availability": "available",
        "more": {
          "href": "https://archive.org/services/loans/loan/?action=webpub&identifier=cu31924011498676&opds=1",
          "rel": "http://opds-spec.org/acquisition/",
          "type": "application/opds-publication+json"
        }
      }
    },
    {
      "href": "https://archive.org/download/cu31924011498676/cu31924011498676.pdf",
      "type": "application/pdf",
      "rel": "http://opds-spec.org/acquisition/open-access",
      "properties": {
        "availability": "available",
        "more": {
          "href": "https://archive.org/services/loans/loan/?action=webpub&identifier=cu31924011498676&opds=1",
          "rel": "http://opds-spec.org/acquisition/",
          "type": "application/opds-publication+json"
        }
      }
    },
    {
      "href": "https://archive.org/download/cu31924011498676/cu31924011498676.epub",
      "type": "application/epub+zip",
      "rel": "http://opds-spec.org/acquisition/open-access",
      "properties": {
        "availability": "available",
        "more": {
          "href": "https://archive.org/services/loans/loan/?action=webpub&identifier=cu31924011498676&opds=1",
          "rel": "http://opds-spec.org/acquisition/",
          "type": "application/opds-publication+json"
        }
      }
    }
  ]
}
Screenshot
Stakeholders
@mekarpeles @MarcCoquand