The CSS Paged Media extensions give you a print-based context: pages, different kinds of pages to work with, and PDF-specific features like bookmarks. Tools like PrinceXML or Antenna House then generate the corresponding PDF from your HTML and CSS.
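To make that concrete, here is a minimal sketch of what such a stylesheet might look like. The class name `chapter` and the page name `chapter-page` are made up for illustration, and bookmark support varies by tool (some engines may expect a vendor-prefixed form of `bookmark-level`), so treat this as a starting point rather than a recipe.

```css
/* Default page box: size and margins for every printed page. */
@page {
  size: A4;
  margin: 25mm 20mm;

  /* Margin box: a running page number at the bottom of each page. */
  @bottom-center {
    content: counter(page);
  }
}

/* Kinds of pages: the first page carries no page number. */
@page :first {
  @bottom-center {
    content: none;
  }
}

/* Left and right pages get a wider inner margin for binding. */
@page :left  { margin-right: 30mm; }
@page :right { margin-left: 30mm; }

/* A named page (hypothetical name) with extra room at the top. */
@page chapter-page {
  margin-top: 60mm;
}

/* Hypothetical class: each chapter starts on a right-hand page
   and uses the named page defined above. */
section.chapter {
  page: chapter-page;
  page-break-before: right;
}

/* PDF bookmarks (outline entries) built from the headings. */
h1 { bookmark-level: 1; }
h2 { bookmark-level: 2; }
```

With Prince, running something like `prince book.html -o book.pdf` (file names assumed here) should then produce the PDF.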
I wouldn’t want a one-to-one reconciliation because, to me, PDF is more for reading offline on paper than for reading online on a screen. That said, there are a lot of things you can do on the web that you can also do in a PDF document.
IMO, there is a significant difference between how information is presented on the web and how it is put on dead tree, so a one-to-one reconciliation of style and presentation isn’t possible. Although I could be wrong about this!
I know! No matter how hard I try, I can’t reconcile my sense of what a book is with the idea of a PDF as a book. I know a lot of people rely on PDF for distribution, but to me PDF is and will always remain a dork.
I’d rather buy dead-tree despite the cost and the fact that I prefer digital for almost everything else.
Compare this to open MS Word / Google Docs → Create PDF. Simple. :-)
A lot of these “digitized” books are raster images that aren’t searchable or indexable as text. Other than investing in expensive OCR tech, along with enough intelligence to reprocess these rasters, I do not see an easy way to salvage these older texts. Kudos to the team behind the Internet Archive, it’s a splendid first step!
@marvindanig, how about getting these archives on Bubblin? Do you think it will be useful?
The article from The New Yorker suggests that SEO is still king and that real readers find articles that interest them via search. Nothing has changed!
There’s a lot set to happen in the space of books and art starting this year, which is a huge win for all of us, yay!
Kitchen Confidential sounds like an interesting read. :-)
This is a more authoritative list of books that are entering the public domain in the United States in 2019.