How to make an offline copy of a website
To create an offline, browsable copy of a website you can use the tool wget. The following steps show how to build an offline copy tailored to your needs.
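The steps below rely on wget for downloading and on mogrify from the ImageMagick suite for the optional image optimization. On Debian/Ubuntu, for example, both can be installed with:
sudo apt install wget imagemagick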
Download the website
The following wget invocation mirrors the site including all page requisites (CSS, images, scripts), rewrites the links so they work locally, and skips feed URLs as well as large binary files. Note that the suffix list passed to -R must not contain spaces:
wget --mirror --no-check-certificate -e robots=off --timestamping --recursive --level=inf \
--no-parent --page-requisites --convert-links --adjust-extension --backup-converted -U Mozilla \
--reject-regex feed -R '*.gz,*.tar,*.mp3,*.zip,*.flv,*.mpg,*.pdf' https://www.example.com
wget stores the mirror in a directory named after the host. Change into it:
cd www.example.com
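You can open the copy directly in a browser via its index.html. If you prefer to browse it over HTTP instead, a quick option (assuming Python 3 is available) is its built-in web server:
python3 -m http.server 8000
The offline site is then reachable at http://localhost:8000.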
Clean up temp files
Because of --backup-converted, wget keeps a .orig backup of every file whose links it rewrote. Remove these backups:
find . -type f -name '*.orig' -delete
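If you want to review what would be removed first, run the same find without the -delete flag:
find . -type f -name '*.orig'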
Optimize images
If you want, you can recompress JPEG images at a lower quality to save disk space. mogrify rewrites the files in place; passing the paths NUL-separated keeps filenames with spaces intact:
find . -iname '*.jpg' -print0 | xargs -0 mogrify -strip -quality 20
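To gauge how much space the recompression saves, check the total size of the mirror before and after the step, for example:
du -sh .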
PNG files can be converted to JPEG as well, but each file has to keep its original .png name, otherwise the links in the offline copy break. mogrify -format jpg writes a new .jpg file next to each .png, so the converted files are renamed back in the next step. Note that JPEG has no transparency, so PNGs with an alpha channel will lose it:
find . -name '*.png' -print0 | xargs -0 mogrify -strip -quality 20 -format jpg
find . -name '*.PNG' -print0 | xargs -0 mogrify -strip -quality 20 -format jpg
# rename the converted jpg files back to their original png names
find . -name '*.png' -exec sh -c 'mv "${0%.png}.jpg" "$0"' '{}' \;
find . -name '*.PNG' -exec sh -c 'mv "${0%.PNG}.jpg" "$0"' '{}' \;
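To spot-check that the renamed files now actually contain JPEG data, inspect a few of them with file:
find . -name '*.png' -exec file {} \; | head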