
How to copy an entire website?

Is it possible to copy an entire website, save it and then use it offline?

I'm a high school teacher and I was thinking of saving websites onto my Android tablets and then using them in class (no internet connection is available, though).

That would be beyond awesome.

Thanks in advance, guys!
 

I'm a teacher as well, same thing: no internet in the classrooms. A lot of the time, though, you can't actually save the whole website, mainly because with a CMS, e.g. WordPress or Drupal, rather than static HTML, the pages are generated on the fly when you view them. You can't save the whole of Android Forums for offline viewing, but you can save pages and copy/paste text, of course.

What I do if I wish to show the students a few web pages is "Save page as" > "Web page, complete" in the browser. This is on a PC with Firefox, though. You can transfer the saved pages to the Android device and they will open and display OK.
 

It is generally not possible to copy an entire website and have it render correctly offline. This is because in most cases the server dynamically adjusts what it sends to the browser on the fly. To reproduce the site correctly, the server must be present.

If the website is pure HTML/JavaScript, then a working offline copy is possible, though any external references it contains will be broken. Displaying the page in the browser and doing a save page will capture it in its entirety.

One tool was already mentioned for copying individual pages. It may or may not work for you. I just tried it on my ASPX website: the background was not rendered and all links were "not available".

... Thom
 
Boot up a Linux live CD and use wget to mirror the entire website to a USB drive.

For example:
Code:
cd /media/USBstick
wget -m http://www.nameofsite.com

That's possible when a website consists of plain static HTML pages.

Most websites these days use a CMS (Content Management System), and the HTML pages don't actually exist on the web server; they're generated by the CMS and downloaded to your browser. If you try it on a WordPress site, you'll likely just get a copy of WordPress, i.e. a load of PHP scripts and CSS. The same goes for ASP sites, and that's if the site allows access to the ASP, PHP, CSS, etc. at all. The actual content is held in a database and is used by the CMS to generate the relevant pages.

AF is on a CMS: all the pages we see here are generated on demand by vBulletin.

I'll often save whole web pages for my classes, including pictures. And if I open them locally, without an internet connection, they do usually display OK.
 
Mike's quite right: only the simplest of sites are not database-backed these days.

You could potentially capture information from a site by taking snapshots of various pages, but you would lose much of the functionality.

For example, you could copy this page but you would not see any posts made after you took your copy and you'd not* be able to post.



* well, potentially you could if you had an internet connection .. and you correctly captured the URLs in the HTML .. and your browser was still logged in ..
 
Depending on your needs, you may be able to get by with just printing a copy of the web page to a PDF document. In this way you could be sure of the content and format of the offline web page, but any dynamic content (videos, animations, even links) will not be available. You can use the free CutePDF Writer to accomplish this if you don't already have the capability.
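If you're on Linux rather than Windows, a rough command-line equivalent is headless printing with Chromium. This is only a sketch: it assumes a recent Chromium is installed, and the URL and output filename below are placeholders.

```shell
# Print a web page straight to PDF without opening a browser window.
# The URL and the output filename are placeholders -- substitute your own.
chromium --headless --disable-gpu --print-to-pdf=page.pdf https://example.com/
```

The resulting PDF has the same limitation noted above: it is a static snapshot, so videos, animations, and links won't function.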
 
[...] If you try it on a Wordpress site, you'll likely just get a copy of Wordpress, i.e. a load of PHP scripts and CSS. [...]
Except that I just tried it out, and it works on dynamically-generated content.

What doesn't work are internal links. But if, for example, you do a wget mirror on a WordPress site, you'll see that individual blog entries come down with their full content, not just some empty PHP scripts and CSS. In fact, it'd be a huge security hole if you could download the PHP scripts from any LAMP server.
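For what it's worth, GNU wget has flags aimed at exactly that internal-links problem. A sketch, assuming GNU wget and using example.com as a placeholder for the site to be saved:

```shell
# Mirror a site and rewrite links so pages point at the local copies.
# example.com is a placeholder -- substitute the site you want to save.
wget --mirror \
     --convert-links \
     --page-requisites \
     --adjust-extension \
     --no-parent \
     https://example.com/
```

Here --convert-links rewrites internal links after the download so clicking between pages works offline, --page-requisites pulls in the images and CSS each page needs, and --adjust-extension saves dynamically-generated pages with an .html suffix so a local browser renders them. It's still only a snapshot of the site at download time, of course.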
 
[...] What doesn't work are internal links. [...]

Yes, that's the problem here. I don't just want to copy separate pages; I need the entire website (or at least part of it) and the ability to move between its pages offline.

Thanks for your help, guys!
 
If it is just individual site pages, Pocket could be useful. I'm always saving to Pocket and then reading stuff offline.
 