Backing Up HTML Files of Hubs and Missing Images

61
eugbug posted 6 years ago
I have discovered something confusing. I normally back up the web pages of my hubs, which creates a main HTML file and an auxiliary folder with the same name containing JavaScript, .jpg and .webp files. One hub has 14 images, yet there are only 7 JPG and WEBP images in this folder. When I open the main .HTML file, the page displays OK (loading these files from my backup, not the online version). So how is it accessing the missing images? Are they somehow embedded in the HTML file, or are they on a server somewhere in the cloud, with the HTML only containing links? That would mean my backups aren't doing what they're supposed to do: downloading and storing all the text and images in a hub.
1. 61
  lobobrandon posted 6 years ago in reply to this
  Disconnect from the internet and try loading your page. Do you now have missing images?
2. 47
  erorantes posted 6 years ago in reply to this
  Good morning, Mr. eugbug. The internet provider allows you to have access to many programs on your computer. Sometimes the computer might be missing a link that your internet provider needs in order to download your images. Perhaps it is time to buy a new PC and change your internet provider. One of my friends told me that Firefox is so far behind. Stay updated with the internet. Also, try to find the answer on GitHub. They are experts on the subject. I wish you a good weekend. Stay well.
61
eugbug posted 6 years ago
No, the images don't display. I checked the source of one of the images and it seems to be loaded from https://usercontent1.hubstatic.com. So I presume the contents of that location aren't stored when I save the page, and the page looks online for the images when loaded. Now, to complicate matters, I saved a hub today and it loads OK when I'm offline. This time, the photo source is reported as my local backup folder.

61
eugbug posted 6 years ago
This seems to be the case with all saved pages, so now I've a ton of work to do, re-backing up everything.
1. 60
  theraggededge posted 6 years ago in reply to this
  I've just done all mine. Again.
  
  What I've noticed a lot of is missing copyright notices
61
Glenn Stok posted 6 years ago
If the source shows “https://” at the beginning of a file's address, then the page is loading that file from the web and not from your local backup. Your backup method is not doing the job correctly. You need to figure out why.
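
A rough way to run this check over a whole saved page, rather than one image at a time, is sketched below in Python. It is only an illustration, not anything a browser or HubPages provides: the function name `remote_image_sources` and the file names in the sample are made up, and only the hubstatic.com host comes from this thread.

```python
from html.parser import HTMLParser


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def remote_image_sources(html_text):
    """Return the image sources that still point at the web."""
    collector = ImageSourceCollector()
    collector.feed(html_text)
    return [s for s in collector.sources
            if s.startswith(("http://", "https://", "//"))]


# Hypothetical saved page: one image stored locally, one still remote.
sample = """
<img src="My Hub_files/12345.jpg">
<img src="https://usercontent1.hubstatic.com/67890.jpg">
"""
print(remote_image_sources(sample))
# ['https://usercontent1.hubstatic.com/67890.jpg']
```

To use it on a real backup, read the saved .html file into a string and pass it in. An empty list suggests every image reference was rewritten to a local path.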
1. 61
  eugbug posted 6 years ago in reply to this
  If this is the case, does that mean that even if the browser saves the images, it still can't display them offline in the structure of the page because the code uses a hyperlink to an online location? Or should the browser automatically adjust the references?
  61
  Glenn Stok posted 6 years ago in reply to this
  The backup method should have changed all references to the local location in the HTML source code. I don’t know why your browser is doing it for some and not for all.
  
  Did you check to see if the image files have been saved in your backup even though the link is not correct?
  
  What browser and OS are you using? I’ll try to help.
  60
  theraggededge posted 6 years ago in reply to this
  I just experimented by disconnecting from the wifi. The full hub shows up, but no photos. My files start with:
  
  file:///C:/Users/Beverley/Dropbox/Hubstuff/....
  
  They are saved as HTML files. As soon as I reconnect and refresh, the photos are back. I suspect this is due to the way HubPages displays images through the capsules.
  
  Therefore, if HubPages were to disappear, only the text would display.
  
  Just out of interest, I checked my old Squidoo backups, and again, all the text is there but the images are not, apart from the Amazon modules.
61
lobobrandon posted 6 years ago
I just saved a hub and see the same thing. The images in the hub do not download, though the sidebar images for popular and related hubs are saved.
61
lobobrandon posted 6 years ago
An alternative to manually saving every image is to save the content of your hubs and put some faith in the Wayback Machine archive staying online.

I am not sure how to update a URL and save a new image, but if your hub is not in the archive you can add it:

This page is available on the web!
Help make the Wayback Machine more complete!

Save this url in the Wayback Machine

Here's the URL for others who do not know about it: https://web.archive.org/
You would have to save your HP URLs, though, so that you can get to them later. Also archive your profile page: if you lose some URLs, you can still get to them from there.

The reason to do this: images are saved and you can download them.
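
For anyone keeping a list of hub URLs, here is a small Python sketch that builds the matching Wayback Machine addresses. It only constructs the URLs (open them in a browser to use them) and does not contact the archive itself. The hub address in the example is hypothetical, and the function names are just for illustration.

```python
def wayback_save_url(page_url):
    """Address that asks the Wayback Machine to capture page_url."""
    return "https://web.archive.org/save/" + page_url


def wayback_history_url(page_url):
    """Address listing the captures the archive already holds for page_url."""
    return "https://web.archive.org/web/*/" + page_url


# Hypothetical hub address, for illustration only.
hub = "https://hubpages.com/example/my-hub"
print(wayback_save_url(hub))
print(wayback_history_url(hub))
```

Running it over a text file of your hub URLs gives you one list to click through for archiving and another for checking what is already stored.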
61
eugbug posted 6 years ago
Thanks Glenn. I thought some pages I saved today were stored with the accompanying images. However, I checked and it seems they were being loaded from the cache. When I cleared my history, the images were no longer visible. So no images are stored other than a profile pic and the main article image from the top of the page. This is getting more mysterious, because sometimes the images are in the backup folder when I save even though I haven't changed anything, so it makes no sense.
I'm using Firefox 76.0.1 and Windows 10.
61
Glenn Stok posted 6 years ago
It works with Firefox 76.0.1 and Safari 13.1. You just need to tell it to save everything.

Firefox:

It works well with Firefox. It saves it with the main HTML and a separate folder containing all the other files and images.

• Click File > Save Page As
• Change Format to “Web Page, complete”
• Click “save”

There is no need to keep the JavaScript, so I like to delete all the “.js” files after the backup is created.
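
If there are many backups to tidy, that cleanup could be scripted. The following Python is a minimal sketch, assuming the folder layout described above (an auxiliary folder next to the main HTML file). The folder and file names in the demonstration are invented, and it runs against a temporary folder so no real backup is touched.

```python
import tempfile
from pathlib import Path


def remove_scripts(backup_folder):
    """Delete every .js file under backup_folder and return their names."""
    removed = []
    for path in sorted(Path(backup_folder).rglob("*.js")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed


# Demonstration on a throwaway folder with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "My Hub_files"
    folder.mkdir()
    (folder / "ads.js").write_text("// script")
    (folder / "photo.jpg").write_bytes(b"")
    print(remove_scripts(folder))                    # ['ads.js']
    print(sorted(p.name for p in folder.iterdir()))  # ['photo.jpg']
```

Deletion is permanent, so it is worth trying this on a copy of a backup first and opening the page offline afterwards to confirm it still displays.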

Safari:

You can do it with Safari too, but it saves it as a single file containing everything.

• Click File > “Save as”
• Change the default “Page Source” to “Web Archive”.
• Click “save”

Now you’ll save the images along with the HTML in a single “.webarchive” file.

EDIT: I just tested again with both browsers and viewed the complete backups offline. Everything was there.
1. 61
  eugbug posted 6 years ago in reply to this
  I noticed that save option; it's already the default in my browser, but the image files still aren't saving. They did on one occasion, but I can't recreate how I managed it.
  61
  Glenn Stok posted 6 years ago in reply to this
  That's strange. I use the same version of Firefox and it works. The only difference is that I'm on a Mac.
  
  You say it used to work. Can you think of anything you might have changed—settings in Windows?
  61
  eugbug posted 6 years ago in reply to this
  I was playing around with the ad blocker. I usually turn it off, because on several occasions I noticed that when I tried to save a page it failed because the page was waiting for ads to load. Had I not caught this message on the status bar at the bottom of the screen, I could have ended up with lots of incomplete backups. I notice on some Firefox forums that other people are having similar problems.
  61
  Glenn Stok posted 6 years ago in reply to this
  Maybe the problem is related to your ad blocker. That's a good thought to consider. I don't have the ad blocker enabled on either browser. I just checked. (I enable it sometimes when I test Amazon capsules, but that's not related to this issue.)
  
  EDIT: Well, what do you know! I just enabled my ad blocker in Firefox (Adblock Plus) and the backup failed to keep the images. Good catch, Eugene. That's the problem.
61
eugbug posted 6 years ago
I did disable it, but still no joy. This is a really frustrating problem. One of those problems that I end up staying up until the early hours of the morning to fix!
1. 61
  Glenn Stok posted 6 years ago in reply to this
  Try disabling ALL extensions on your browser. Maybe one of the others is affecting it.
61
eugbug posted 6 years ago
Ok, I think I have got to the bottom of what's causing this.
I turned off all extensions but the problem remained. I noticed however that the first image in the hub was downloaded. So I tried scrolling down bit by bit, clearing the cache and saving each time. The further I scrolled down, the more images were downloaded. So it seems that the browser will only save images that are displayed, and not all content associated with the page.
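
One possible explanation for this behaviour, which nothing in this thread confirms for HubPages, is lazy loading: the page defers fetching an image until it scrolls into view. As a rough sketch, the Python below counts how many images in a page look deferred. It assumes two common conventions, the standard `loading="lazy"` attribute and a `data-src` attribute holding the real address. Whether HubPages uses either is an assumption, and the sample markup is made up.

```python
from html.parser import HTMLParser


class LazyImageCounter(HTMLParser):
    """Count <img> tags and how many of them look deferred."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.deferred = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total += 1
        attributes = dict(attrs)
        # Assumed markers of lazy loading; a given site may use others.
        if attributes.get("loading") == "lazy" or "data-src" in attributes:
            self.deferred += 1


def count_images(html_text):
    """Return (total images, images that appear to be lazy-loaded)."""
    counter = LazyImageCounter()
    counter.feed(html_text)
    return counter.total, counter.deferred


# Hypothetical markup: one normal image and two deferred ones.
sample = """
<img src="top.jpg">
<img src="placeholder.gif" data-src="second.jpg">
<img src="third.jpg" loading="lazy">
"""
print(count_images(sample))  # (3, 2)
```

If the second number is well above zero for a hub's source, that would be consistent with images only being saved after scrolling.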
1. 61
  Glenn Stok posted 6 years ago in reply to this
  Yes, that is true. I always scroll down to the bottom before I back up so I get all the comments too. That's why it always got all the images for me.
  
  I just tried a backup without scrolling first, and I was able to reproduce the problem you were having.
  
  Problem solved. Good work Eugene.
  61
  eugbug posted 6 years ago in reply to this
  I wonder is there a way to force Firefox to load the whole page? I have 110 articles to save so it would be nice to be able to make it a bit easier. Also could pages be saved in a batch by using a website ripper somehow? When we had subdomains, it was probably easier to do this because of the folder/URL structure.
  61
  Glenn Stok posted 6 years ago in reply to this
  Automating it to do all the hubs in a given list would be wonderful. We'd need a script to do that. I'll have to think about that.
  
  However, the "Avast Secure Browser" does a backup of an entire page without scrolling first. So that at least makes it a little easier.
  
  I just tested it and it saves all the images and the comments as well.
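
As for a script that works through a list of hubs, a very rough starting point in Python might look like the following. It is a sketch under several assumptions and has not been tested against HubPages. It assumes the image addresses appear in the raw HTML in `src` or `data-src` attributes, and that the site accepts Python's default downloader (some sites reject it). All function names and the output layout are invented for illustration. It stores the images next to the unmodified HTML, but it does not rewrite the references inside the page, so it safeguards the files rather than producing an offline-viewable copy.

```python
import posixpath
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ImageFinder(HTMLParser):
    """Collect image addresses from src and (assumed) data-src attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if name in ("src", "data-src") and value:
                self.found.append(value)


def image_urls(html_text, page_url):
    """Return absolute, de-duplicated image URLs in document order."""
    finder = ImageFinder()
    finder.feed(html_text)
    urls = []
    for raw in finder.found:
        if raw.startswith("data:"):
            continue  # inline image, nothing to download
        absolute = urljoin(page_url, raw)
        if absolute not in urls:
            urls.append(absolute)
    return urls


def local_name(url):
    """File name under which a downloaded resource is stored."""
    return posixpath.basename(urlparse(url).path) or "unnamed"


def backup_hub(page_url, out_dir):
    """Save one page and its images into out_dir (needs a connection)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with urlopen(page_url) as response:
        html_text = response.read().decode("utf-8", errors="replace")
    (out / "page.html").write_text(html_text, encoding="utf-8")
    for url in image_urls(html_text, page_url):
        with urlopen(url) as response:
            (out / local_name(url)).write_bytes(response.read())


def backup_all(page_urls, root_dir):
    """Back up each hub into its own numbered subfolder of root_dir."""
    for number, page_url in enumerate(page_urls, start=1):
        backup_hub(page_url, Path(root_dir) / f"hub_{number:03d}")
```

Calling `backup_all` with a list of hub URLs and a destination folder would create `hub_001`, `hub_002` and so on. Two images sharing a file name within one hub would overwrite each other, so a real script would need to handle that, along with download errors.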
  88
  NateB11 posted 6 years ago in reply to this
  I used to use the Scrapbook add-on. I don't know if it's still available: I've since gotten a different computer and use a different browser, so I lost all that years ago. Anyway, it was an easy way to save a bunch of articles. I just followed this guy's instructions and it worked like a charm:
  
  https://www.youtube.com/watch?v=2aUmTRrEsyE
  60
  theraggededge posted 6 years ago in reply to this
  Evernote and Notion could do that too. With Evernote, you get the option to save a simplified version which gets rid of all the sidebar stuff but keeps all the images. The problem is that the free version is limited.
  
  And there's Pocket - a bit more like Scrapbook. I think that may be dependent upon the original still being 'live'.
  
  Notion - well, I've recently fallen in love with it and bought the paid version. I've had Evernote for years but just don't love it much. Trouble is - now I can't remember where I put things
  88
  NateB11 posted 6 years ago in reply to this
  I'll have to check those out. I did a brief search of them to get an idea, but too groggy right now to go in-depth. Thanks, Bev.