Backing Up HTML Files of Hubs and Missing Images

  1. eugbug posted 5 years ago

    I've discovered something confusing. I normally back up the webpages of my hubs, which creates a main HTML file and an auxiliary folder with the same name containing JavaScript files, .jpg files and .webp files. On one hub I have 14 images, but there are only 7 JPG and WEBP images in this folder. When I open the main .HTML file, the page displays OK (loading these files from my backup, not the online version). So how is it accessing the missing images? Are they somehow embedded in the HTML file, or are they on a server somewhere in the cloud, with the HTML only containing links? That would mean my backups aren't actually doing what they're supposed to do: downloading and storing all the text and images in a hub.

    1. lobobrandon posted 5 years ago in reply to this

      Disconnect from the internet and try loading your page. Do you now have missing images?

    2. erorantes posted 5 years ago in reply to this

      Good morning, Mr. eugbug. Your internet provider allows you to access many programs on your computer. Sometimes the computer might be missing a link that your internet provider needs in order to download your images. Perhaps it is time to buy a new computer and change your internet provider. One of my friends told me that Firefox is far behind. Stay up to date with the internet. Also, try to find the answer on GitHub. They are experts on the subject. I wish you a good weekend. Stay well.

  2. eugbug posted 5 years ago

    No, the images don't display. I checked the source of one of the images and it seems to be loading from https://usercontent1.hubstatic.com. So I presume the content at that location isn't stored when I save the page, and the page looks online for the images when it loads. To complicate matters, I saved a hub today and it loads OK when I'm offline. This time, the photo source is reported as my local backup folder.



    https://hubstatic.com/15025810_f1024.jpg
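
One way to check which images a saved page still pulls from the web, rather than from the backup folder, is to list every image source in the HTML that starts with http or https. The following is only a minimal sketch: the function name `find_remote_images` and the sample file names are made up for illustration, and it looks only at the `src` attribute of `<img>` tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageSourceCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_remote_images(html):
    """Return the <img> sources that point at a web server, not a local file."""
    collector = ImageSourceCollector()
    collector.feed(html)
    return [
        src
        for src in collector.sources
        # "//host/path" is a protocol-relative URL and is also remote.
        if urlparse(src).scheme in ("http", "https") or src.startswith("//")
    ]


if __name__ == "__main__":
    # Hypothetical saved page: one remote image, one image in the backup folder.
    sample = (
        '<img src="https://usercontent1.hubstatic.com/a.jpg">'
        '<img src="hub_files/b.jpg">'
    )
    for src in find_remote_images(sample):
        print("still online only:", src)
```

Any source it reports will show as a broken image when the page is opened offline with an empty cache.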

  3. eugbug posted 5 years ago

    This seems to be the case with all saved pages, so now I've a ton of work to do, re-backing up everything.

    1. theraggededge posted 5 years ago in reply to this

      I've just done all mine. Again.

      What I've noticed is a lot of missing copyright notices :(

  4. Glenn Stok posted 5 years ago

    If the source shows “https://” at the beginning of the file references, then the page is loading them from the web and not from your local backup. Your backup method is not doing the job correctly. You need to figure out why.

    1. eugbug posted 5 years ago in reply to this

      If this is the case, does that mean that even if the browser saves the images, it still can't display them offline in the structure of the page because the code uses a hyperlink to an online location? Or should the browser automatically adjust the references?

      1. Glenn Stok posted 5 years ago in reply to this

        The backup method should have changed all references to the local location in the HTML source code. I don’t know why your browser is doing it for some and not for all.

        Did you check to see if the image files have been saved in your backup even though the link is not correct?

        What browser and OS are you using? I’ll try to help.
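
To illustrate what "changing all references to the local location" means, here is a rough sketch of the rewrite a complete save is expected to perform on image tags. It is not how Firefox actually does it; `localize_image_refs` and the folder name `hub_files` are hypothetical, and the pattern only handles double-quoted `src` attributes.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Matches <img ... src="http(s)://..."> and captures the parts around the URL.
IMG_SRC = re.compile(r'(<img\b[^>]*?\ssrc=")(https?://[^"]+)(")', re.IGNORECASE)


def localize_image_refs(html, files_dir):
    """Point every remote <img> src at a same-named file inside files_dir."""

    def to_local(match):
        # Keep only the file name from the URL path, e.g. 15025810_f1024.jpg.
        name = PurePosixPath(urlparse(match.group(2)).path).name
        return f"{match.group(1)}{files_dir}/{name}{match.group(3)}"

    return IMG_SRC.sub(to_local, html)


if __name__ == "__main__":
    before = '<img alt="x" src="https://hubstatic.com/15025810_f1024.jpg">'
    print(localize_image_refs(before, "hub_files"))
```

The rewritten page only works offline if the image files themselves were also saved into that folder, which is the other thing to check.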

      2. theraggededge posted 5 years ago in reply to this

        I just experimented by disconnecting from the wifi. The full hub shows up, but no photos. My files start with:

        file:///C:/Users/Beverley/Dropbox/Hubstuff/....

        They are saved as HTML files. As soon as I reconnect and refresh, the photos are back. I suspect this is due to HubPages' way of displaying the images through the capsules.

        Therefore, if HubPages were to disappear, only the text would display.

        Just out of interest, I checked my old Squidoo backups and, again, all the text is there but the images are not, apart from the Amazon modules.

  5. lobobrandon posted 5 years ago

    I just saved a hub and see the same thing. The images in the hub itself do not download, although the images for the popular and related articles in the sidebar are saved.

  6. lobobrandon posted 5 years ago

    An alternative to manually saving every image is to save the text content of your hubs and put some faith in the Wayback Machine archive staying online.

    I am not sure how to update an already archived URL so that it saves a new image, but if your hub is not in the archive you can add it. For a page that has not been archived yet, the site shows:

    This page is available on the web!
    Help make the Wayback Machine more complete!

    Save this url in the Wayback Machine

    Here's the URL for others who do not know about it: https://web.archive.org/
    You would have to keep a list of your HP URLs, though, so that you can get to them later. Also archive your profile page, so that if you lose some URLs you can still reach them from there.

    The reason to do this: the images are saved and you can download them.
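
For a long list of hubs, the Wayback Machine addresses can be generated rather than typed. This sketch only builds the links and does not contact the archive. It assumes the archive's public address patterns (`/save/` followed by the page URL to request a capture, and `/web/*/` followed by the page URL to list existing captures); the hub URL in the example is made up.

```python
WAYBACK = "https://web.archive.org"


def wayback_save_url(page_url):
    """Address that asks the Wayback Machine to capture page_url."""
    return f"{WAYBACK}/save/{page_url}"


def wayback_captures_url(page_url):
    """Address that lists the captures the archive already holds for page_url."""
    return f"{WAYBACK}/web/*/{page_url}"


if __name__ == "__main__":
    # Hypothetical list of hub URLs; replace with your own.
    hubs = ["https://hubpages.com/example/my-first-hub"]
    for hub in hubs:
        print(wayback_save_url(hub))
        print(wayback_captures_url(hub))
```

Opening each "save" link in a browser should trigger a capture; keeping the printed list is also a simple record of your hub URLs.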

  7. eugbug posted 5 years ago

    Thanks Glenn. I thought some pages I saved today were stored with the accompanying images. However, I checked and it seems they were being loaded from the cache. When I cleared my history, the images were no longer visible. So no images are stored other than a profile pic and the main article image from the top of the page. This is getting more mysterious, because sometimes the images are in the backup folder when I save, even though I haven't changed anything, so it makes no sense.
    I'm using Firefox 76.0.1 and Windows 10.

  8. Glenn Stok posted 5 years ago

    It works with Firefox 76.0.1 and Safari 13.1. You just need to tell it to save everything.

    Firefox:

    It works well with Firefox. It saves it with the main HTML and a separate folder containing all the other files and images.

    • Click File > Save Page As
    • Change Format to “Web Page, complete”
    • Click “save”

    There is no need to keep the JavaScript, so I like to delete all the “.js” files after the backup is created.


    Safari:

    You can do it with Safari too, but it saves it as a single file containing everything.

    • Click File > “Save as”
    • Change the default “Page Source” to “Web Archive”.
    • Click “save”

    Now you’ll save the images along with the HTML in a single “.webarchive” file.

    EDIT: I just tested again with both browsers and viewed the complete backups offline. Everything was there.

    1. eugbug posted 5 years ago in reply to this

      I noticed that save option, and it's already the default in my browser, but the image files still aren't saving. They did on one occasion, but I can't recreate how I managed it.

      1. Glenn Stok posted 5 years ago in reply to this

        That's strange. I use the same version of Firefox and it works. The only difference is that I'm on a Mac.

        You say it used to work. Can you think of anything you might have changed—settings in Windows?

        1. eugbug posted 5 years ago in reply to this

          I was playing around with the ad blocker. I usually turn it off, because on several occasions I noticed that saving a page failed because the page was waiting for ads to load. If I hadn't caught that message on the status bar at the bottom of the screen, I could have ended up with lots of incomplete backups. I notice on some Firefox forums that other people are having similar problems.

          1. Glenn Stok posted 5 years ago in reply to this

            Maybe the problem is related to your ad blocker. That's a good thought to consider. I don't have the ad blocker enabled on either browser. I just checked. (I enable it sometimes when I test Amazon capsules, but that's not related to this issue.)

            EDIT: Well, what do you know! I just enabled my ad blocker in Firefox (Adblock Plus) and the backup failed to keep the images. Good catch, Eugene. That's the problem.

  9. eugbug posted 5 years ago

    I did disable it, but still no joy. This is a really frustrating problem. One of those problems that I end up staying up until the early hours of the morning to fix!

    1. Glenn Stok posted 5 years ago in reply to this

      Try disabling ALL extensions on your browser. Maybe one of the others is affecting it.

  10. eugbug posted 5 years ago

    Ok, I think I have got to the bottom of what's causing this.
    I turned off all extensions but the problem remained. I noticed however that the first image in the hub was downloaded. So I tried scrolling down bit by bit, clearing the cache and saving each time. The further I scrolled down, the more images were downloaded. So it seems that the browser will only save images that are displayed, and not all content associated with the page.

    1. Glenn Stok posted 5 years ago in reply to this

      Yes, that is true. I always scroll down to the bottom before I back up so that I get all the comments too. That's why I always got all the images.

      I just tried a backup without scrolling first, and I was able to reproduce the problem you were having.

      Problem solved. Good work Eugene.
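
Since a save made without scrolling can silently miss images, it may be worth checking each backup afterwards. The sketch below reads a saved HTML file and reports every image reference that is either still a web address or points at a file that is not on disk. The name `missing_local_images` is made up, and it assumes the browser wrote relative paths such as `hub_files/a.jpg` into the saved page.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class ImageSourceCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_local_images(html_path):
    """Return image references in a saved page that have no local file behind them."""
    html_path = Path(html_path)
    collector = ImageSourceCollector()
    collector.feed(html_path.read_text(encoding="utf-8", errors="replace"))

    missing = []
    for src in collector.sources:
        parsed = urlparse(src)
        if parsed.scheme == "data":
            continue  # image is embedded in the HTML itself
        if parsed.scheme in ("http", "https") or src.startswith("//"):
            missing.append(src)  # still loaded from the web
            continue
        # Assumption: anything else is a path relative to the saved HTML file.
        if not (html_path.parent / unquote(parsed.path)).is_file():
            missing.append(src)
    return missing


if __name__ == "__main__":
    import sys

    for page in sys.argv[1:]:
        for src in missing_local_images(page):
            print(f"{page}: not backed up: {src}")
```

Run against a folder of saved hubs, an empty report suggests the images made it into the backup; any listed source means that page needs to be scrolled to the bottom and saved again.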

      1. eugbug posted 5 years ago in reply to this

        I wonder if there is a way to force Firefox to load the whole page? I have 110 articles to save, so it would be nice to make it a bit easier. Also, could pages be saved in a batch by using a website ripper somehow? When we had subdomains, it was probably easier to do this because of the folder/URL structure.

        1. Glenn Stok posted 5 years ago in reply to this

          Automating it to do all the hubs in a given list would be wonderful. We'd need a script to do that. I'll have to think about that.

          However, the "Avast Secure Browser" does a backup of an entire page without scrolling first. So that at least makes it a little easier.

          I just tested it and it saves all the images and the comments as well.
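
A script for a list of hubs could look roughly like the sketch below. It is untested against HubPages and rests on several assumptions: the function names are invented, it only follows double-quoted `src` attributes on `<img>` tags, and a plain download does not run the page's JavaScript, so images the page only inserts while scrolling may not appear in the HTML it receives. Image files with the same name from different servers would also overwrite each other.

```python
import re
from pathlib import Path, PurePosixPath
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen

IMG_SRC = re.compile(r'<img\b[^>]*?\ssrc="([^"]+)"', re.IGNORECASE)


def fetch_bytes(url):
    """Download a URL and return its raw bytes."""
    request = Request(url, headers={"User-Agent": "hub-backup-sketch"})
    with urlopen(request, timeout=30) as response:
        return response.read()


def backup_page(url, out_dir, fetch=fetch_bytes):
    """Save one page as <slug>.html plus a <slug>_files folder of its images."""
    out_dir = Path(out_dir)
    slug = PurePosixPath(urlparse(url).path).name or "index"
    files_dir = out_dir / f"{slug}_files"
    files_dir.mkdir(parents=True, exist_ok=True)

    html = fetch(url).decode("utf-8", errors="replace")
    for src in dict.fromkeys(IMG_SRC.findall(html)):  # de-duplicate, keep order
        absolute = urljoin(url, src)
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # e.g. data: URIs are already embedded
        name = PurePosixPath(urlparse(absolute).path).name
        if not name:
            continue
        (files_dir / name).write_bytes(fetch(absolute))
        # Point the page at the local copy instead of the web address.
        html = html.replace(f'src="{src}"', f'src="{files_dir.name}/{name}"')

    page_path = out_dir / f"{slug}.html"
    page_path.write_text(html, encoding="utf-8")
    return page_path


def backup_all(urls, out_dir, fetch=fetch_bytes):
    """Back up every URL in the list and return the saved HTML paths."""
    return [backup_page(url, out_dir, fetch) for url in urls]
```

Calling `backup_all(list_of_hub_urls, "Hubstuff")` would write one HTML file and one image folder per hub. The `fetch` parameter exists so the download step can be swapped out, for example to test the script without a network connection.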

        2. NateB11 posted 5 years ago in reply to this

          I used to use the Scrapbook add-on. I don't know if it's still available, because I've since gotten a different computer and use a different browser, so I lost all that years ago. Anyway, it was an easy way to save a bunch of articles. I just followed this guy's instructions and it worked like a charm:

          https://www.youtube.com/watch?v=2aUmTRrEsyE

          1. theraggededge posted 5 years ago in reply to this

            Evernote and Notion could do that too. With Evernote, you get the option to save a simplified version, which gets rid of all the sidebar stuff but keeps all the images. The problem is that the free version is limited.

            And there's Pocket, which is a bit more like Scrapbook. I think that may depend on the original still being 'live'.

            Notion: well, I've recently fallen in love with it and bought the paid version. I've had Evernote for years but just don't love it much. Trouble is, now I can't remember where I put things :D

            1. NateB11 posted 5 years ago in reply to this

              I'll have to check those out. I did a brief search of them to get an idea, but too groggy right now to go in-depth. Thanks, Bev.

 