Reviews for WebScrapBook
WebScrapBook by Danny Lin
Review by 14802265
Rated 2 out of 5
by 14802265, 6 years ago
I tried this app to save webpages completely and accurately. On some pages, like ghacks.net, it works perfectly with the scripted single HTML option. On others, like nytimes.com, it captures the page out of sync even though all of the content seems to be there (large gaps, enlarged photos, etc.). Save Page WE has the same issue. On washingtonpost.com WebScrapBook was almost perfect, but there is a bug that adds incorrect characters wherever there is an apostrophe in the text (which in a news article there undoubtedly will be). I used the scripted single HTML option here as well. I do have specific scripts running for the Times and the Post, but they are not the issue, since Mozilla Archive Format and SingleFile always work perfectly on the same sites with the same scripts running. But since MAF doesn't work in current browsers and SingleFile works somewhat inconsistently (it stalls a lot), I was hoping WebScrapBook would work, but no go.
Also, I haven't seen an option to save the original page URL, either in the title or in the .html file for reference, as MAF, SingleFile, and Save Page WE can.
I noticed that the nytimes.com icon from the saved page was used in the tab, but WebScrapBook couldn't find the icon for the washingtonpost.com tab. If the developer wants to see the output files, just tell me where to forward them.
This app might be able to save websites, but if it can't do it accurately, what's the point of using it?
Developer response
posted 6 years ago
Thank you for the feedback.
The issue on nytimes.com is the same as the one with styled components, and we are working on it (https://github.com/danny0838/webscrapbook/issues/109). It's a complicated issue, as there are many things behind the scenes to deal with. We almost have the solution but still need some time to implement it, maybe in the next one or two revisions.
I can't reproduce the issue on washingtonpost.com; it may really be related to the scripts you mentioned. Could you confirm this (by disabling your scripts and checking whether the issue is still there) and provide the scripts you are using for further investigation?
The source page URL is recorded in the source code of the saved page but not shown directly. You'll be able to see it in the metadata if the backend server is used; otherwise you can see it in the source code. We are still investigating an appropriate way to present such metadata without altering the document too explicitly.
As this add-on site doesn't allow discussion, you can report issues to the source code repo (like the link provided above) so that we can discuss and track them better :)