Use cron to extract and compress (zip & unzip) on Cloud Sites


Scripting extraction and compression of files

IMPORTANT NOTE: Set the cron job's "command type" to Perl so that the shell script (.sh file) executes properly. Cron jobs that run longer than 900 seconds (15 minutes) are automatically terminated.

For these examples, replace "/SOURCE/DIRECTORY/" and "/DESTINATION/DIRECTORY/" with the appropriate Web directories (like "/mnt/stor1-wc1-dfw1/123456/www.example.com/web/content/archives/").
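Because a cron job runs without a terminal, it can help to send a command's output to a log file that you can download afterward. The following is a minimal sketch rather than part of the original instructions; the helper name run_logged and the log file name cron.log are hypothetical.

```shell
#!/bin/sh
# Sketch: run a command and append its output and exit status to a log file.
# run_logged and cron.log are hypothetical names used for illustration.

run_logged() {
    log_file=$1
    shift
    echo "started: $*" >>"$log_file"
    # Run the remaining arguments as a command; capture stdout and stderr.
    "$@" >>"$log_file" 2>&1
    status=$?
    echo "exit status: $status" >>"$log_file"
    return "$status"
}

# Example call with the article's placeholder paths (edit, then uncomment):
# run_logged /DESTINATION/DIRECTORY/cron.log \
#     tar -cvzf /DESTINATION/DIRECTORY/file.tar.gz /SOURCE/DIRECTORY/
```

An exit status of 0 in the log means the command reported success; anything else points to the error messages logged just above it.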

Compressing

If you're scripting file compression, create a file named "compress.sh" to start.

Zip compression

To compress a directory to zip format, add these lines to the script:

#!/bin/sh
zip -9pr /DESTINATION/DIRECTORY/file.zip /SOURCE/DIRECTORY/

Where "file.zip" is the name that you assign to the zip file.

If you're compressing only a single file, the script is similar but does not need the "r" (recurse into directories) option:

#!/bin/sh
zip -9p /DESTINATION/DIRECTORY/file.zip /SOURCE/DIRECTORY/targetfile.txt

Tar archiving

Put this in the script to archive a directory in tar format:

#!/bin/sh
tar -cvf /DESTINATION/DIRECTORY/file.tar /SOURCE/DIRECTORY/

Where "file.tar" is the name that you assign to the archive file.

Tar.gz compression

Put this in the script to archive and compress a directory into a gzipped tar format:

#!/bin/sh
tar -cvzf /DESTINATION/DIRECTORY/file.tar.gz /SOURCE/DIRECTORY/

Where "file.tar.gz" is the name that you assign to the compressed file.
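Since the cron job gives no on-screen feedback, one option is to check that the new archive can be read back before relying on it. This is a minimal sketch, assuming a tar that supports the -z option as on typical Linux hosts; make_targz is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: create a gzipped tar archive, then confirm it can be listed.
# make_targz is a hypothetical helper name.

make_targz() {
    archive=$1
    source_dir=$2
    # -c create, -z gzip, -f archive file (no -v, to keep cron output short)
    tar -czf "$archive" "$source_dir" || return 1
    # -t lists the contents; a failure here means the archive is unreadable
    tar -tzf "$archive" >/dev/null
}

# Example call with the article's placeholder paths (edit, then uncomment):
# make_targz /DESTINATION/DIRECTORY/file.tar.gz /SOURCE/DIRECTORY/
```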

Extracting

If you're scripting file extraction, create a file named "decompress.sh" to start.

Zip extraction

Add these lines to decompress from zip format:

#!/bin/sh
unzip -o /SOURCE/DIRECTORY/file.zip -d /DESTINATION/DIRECTORY/

Where "file.zip" is the name of the zip file to be uncompressed.

(NOTE: the -o option will force unzip to overwrite existing files!)
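If overwriting is not what you want, Info-ZIP's unzip also accepts -n, which skips any file that already exists in the destination. Here is a short sketch under that assumption; unzip_keep_existing is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract a zip file without replacing files that already exist.
# unzip_keep_existing is a hypothetical helper name.

unzip_keep_existing() {
    zip_file=$1
    dest_dir=$2
    # -n: never overwrite existing files; -d: directory to extract into
    unzip -n "$zip_file" -d "$dest_dir"
}

# Example call with the article's placeholder paths (edit, then uncomment):
# unzip_keep_existing /SOURCE/DIRECTORY/file.zip /DESTINATION/DIRECTORY/
```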

Tar extraction

Put this in the script to extract from tar format:

#!/bin/sh
tar -xvf /SOURCE/DIRECTORY/file.tar -C /DESTINATION/DIRECTORY/

Where "file.tar" is the name of the tar archive to be extracted.

Tar.gz extraction

Put this in the script to extract from gzipped tar format:

#!/bin/sh
tar -xvzf /SOURCE/DIRECTORY/file.tar.gz -C /DESTINATION/DIRECTORY/

Where "file.tar.gz" is the name of the compressed file.
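The -C option makes tar change into the destination directory before extracting, so that directory has to exist already. The following is a minimal sketch that creates it first; extract_targz is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: make sure the destination exists, then extract a .tar.gz into it.
# extract_targz is a hypothetical helper name.

extract_targz() {
    archive=$1
    dest_dir=$2
    # -p creates missing parent directories and is harmless if it exists
    mkdir -p "$dest_dir" || return 1
    tar -xzf "$archive" -C "$dest_dir"
}

# Example call with the article's placeholder paths (edit, then uncomment):
# extract_targz /SOURCE/DIRECTORY/file.tar.gz /DESTINATION/DIRECTORY/
```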

Cron FAQs:

What is a cron job?

How do I enable or disable cron?

How do I schedule a cron job?

How do I import a large MySQL database?



© 2011-2013 Rackspace US, Inc.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License



15 Comments

I wish you guys would make a better way for us to unzip files that have been uploaded via FTP. A web interface would be nice... or maybe some limited shell access.

Thanks for the feedback Joe. You might post that suggestion to our feature request forum to make sure it's seen by the right eyes. The address is: http://feedback.rackspacecloud.com/forums/71021-product-feedback

I have battled with this for YEARS!!!
So, here is my solution. Running this script will allow you to browse/manage files on your server also allowing you to zip and unzip!
http://extplorer.sourceforge.net/

I have a large zip archive of an old site to extract in /{webdir}/legacy. I created a file named decompress.sh in that directory and included these two lines in the file:

#!/bin/sh
tar -xzvf /{webdir}/legacy/legacy.tar

I then set up a PERL type crontab with the following command:

/{webdir}/legacy/decompress.sh

Right? Wrong? Not seeing any action.

It sounds like we need to tweak the extraction command based on the archived file's extension. If it is just "legacy.tar" you'll want to leave the "z" out of the tar command's options, as in: "tar -xvf /{webdir}/legacy/legacy.tar".
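If you handle more than one kind of archive, a script can choose the command from the file name instead. This is only a sketch, with the hypothetical helper name extract_by_extension, and it assumes the extension matches the real format:

```shell
#!/bin/sh
# Sketch: choose the extraction command from the archive's file extension.
# extract_by_extension is a hypothetical helper name.

extract_by_extension() {
    archive=$1
    dest_dir=$2
    case "$archive" in
        *.tar.gz|*.tgz) tar -xzf "$archive" -C "$dest_dir" ;;
        *.tar)          tar -xf "$archive" -C "$dest_dir" ;;
        *.zip)          unzip -o "$archive" -d "$dest_dir" ;;
        *)
            echo "unrecognized archive type: $archive" >&2
            return 1
            ;;
    esac
}

# Example call (edit the paths, then uncomment):
# extract_by_extension /{webdir}/legacy/legacy.tar /{webdir}/legacy/
```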

I found these instructions incorrect -- at least in my case. When I substituted the /SOURCE/DIRECTORY/ and /DESTINATION/DIRECTORY/ within my script, it did not work for me with the leading slash. I had to remove the leading slashes and only use SOURCE/DIRECTORY/ and DESTINATION/DIRECTORY/

Will this happen in only some cases? Or should these instructions be corrected?

Thanks,
Anna

Whether or not the leading slash is necessary is a question of how you're trying to point to a directory. Short version: If it works, great. Keep using it. With or without the leading slash, they're just two different ways to refer to the same stuff.

The difference between "/SOURCE/DIRECTORY" and "SOURCE/DIRECTORY" is that the first is an absolute path and the second describes a relative path.

The absolute path defines the path as if you were starting from the root directory of the server, "/", and describing it from there. The example at the beginning of the article that starts with "/mnt" is an example of an absolute path.

The one without the leading slash, "SOURCE/DIRECTORY", is called a relative path. It is resolved against the directory the script is running in. If that is the account's home directory, then the system fills in the home directory where the leading slash would be in an absolute path.

An absolute path can be safer in a script since it doesn't depend on where the system thinks its working directory is. But if you've specified a relative path that works with the script then you shouldn't have a problem and can keep using it in your script.
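To check what a relative path actually resolves to, a script can change to a known directory and print the combined path. The following is a small sketch; resolve_from is a hypothetical helper name, and the path in the example call is the sample one from the top of the article.

```shell
#!/bin/sh
# Sketch: show the full path that a relative path refers to when it is
# resolved from a given directory. resolve_from is a hypothetical helper.

resolve_from() {
    base_dir=$1
    relative_path=$2
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$base_dir" || exit 1
        # After cd, the relative path is looked up under the new directory.
        echo "$(pwd)/$relative_path"
    )
}

# Example call (edit the paths, then uncomment):
# resolve_from /mnt/stor1-wc1-dfw1/123456/www.example.com SOURCE/DIRECTORY
```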

I also had to remove the / in front of the directories in order for the script to run. With the / in front of the directories, the script returned an error saying it couldn't find the .zip file. Once I edited the script and removed the /, it ran fine.

Nice tutorial. I tried it with zip files and it worked. What option can be included in the script to automatically delete the zip file after extraction?

I too want the script to delete the file after compression is done. Is this not possible?

It is possible. The simplest way would be to add an "rm -f /SOURCE/DIRECTORY/targetfile.txt" line to the end. However, a safer approach would use the "&&" operator to make it so the removal only happens if the zip operation succeeded, as in:

#!/bin/sh
zip -9p /DESTINATION/DIRECTORY/file.zip /SOURCE/DIRECTORY/targetfile.txt && rm -f /SOURCE/DIRECTORY/targetfile.txt

The "zip" and "rm" commands would both be on the same line, separated by the "&&".
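The same pattern should work for the earlier question about deleting the zip file after extraction. This is only a sketch, with the hypothetical helper name extract_then_remove; note that unzip also returns a non-zero status for warnings, in which case the zip file is kept.

```shell
#!/bin/sh
# Sketch: extract a zip file and delete it only if unzip reports success.
# extract_then_remove is a hypothetical helper name.

extract_then_remove() {
    zip_file=$1
    dest_dir=$2
    # rm runs only when unzip exits with status 0
    unzip -o "$zip_file" -d "$dest_dir" && rm -f "$zip_file"
}

# Example call with the article's placeholder paths (edit, then uncomment):
# extract_then_remove /SOURCE/DIRECTORY/file.zip /DESTINATION/DIRECTORY/
```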

Just make it easy to backup all of our sites. Is that too much to ask?

Put a button on our main account page to backup all our stuff.

I missed the main point of this page ... that we're creating a text file. Look at this page and all you see are command lines, so I was looking all over for a command line interface on my Rackspace Cloud Site (which does not exist). You might open this page with something like "In this tutorial you'll create a text file, upload that text file, then create a scheduled task (cron job) to execute the commands in the file. Note: commands are typed into a file because your Cloud Site does not support terminal login."

An excellent point, thanks Eric. We'll get something into this article to make the purpose more clear.

If you're using the Windows hosting service you still need to use the Linux path formats.
