Python has generally been used for system scripting tasks. Although it can certainly be used to build more serious applications, scripts are something that Python is particularly well suited for. This is mostly because Python has a large library of utilities for working with various systems.
By using scripting languages the average person can gain more control over their system. A high level language that offers interfaces to lower level components makes it possible for people to write backup routines, strip content from the web, or do some computations that wouldn't be much fun by hand.
I figure here is as good a place as any to have some system command lists. Check them out (if you have good links for any of these, please send them to me):
Microsoft Technet Command Line Reference for Windows NT+
Command line reference for Linux
Command line reference for Mac OS X
Command line reference for Mac OS Classic
Let's start with a simple example of downloading files from the popular web-comic, Penny Arcade. You can check out this game related humor/news site here. I wrote this script because some of my friends and I are big fans, and we wanted to have access to these comics all the time. Poking around on the PA webserver a bit, I figured out where the comics were stored and decided to write this script:
"""
This is a simple script to download all of the
comic strips off of the penny arcade site for archival.
"""
import normalDate #This file needs to be in the path, or in the same directory
import string
import os

def download_lin():
    """<evil laugh>Using this nifty date class I can easily iterate
    through the dates that conveniently fit the penny arcade naming
    convention... </evil laugh>"""
    '''
    Just Change the year and the year on the directory in the
    path variable and you can download the entire PA comic archive.
    '''
    for year in YEARS:
        today = START[year]
        # Depending on the year PA used either Jpegs or Gifs
        # A more intelligent script could probably check to see
        # which one is available, and then download appropriately.
        # This one just bruteforce tries both types for everything.
        type = "h.jpg"
        type2 = "h.gif"
        finish = FINISH[year]
        while(today >= finish):
            list,list2 = [],[]
            list.append("%s" % today)
            list2.append("%s" % today)
            list.append(type)
            list2.append(type2)
            item = string.join(list,'')
            item2 = string.join(list2,'')
            # Notice this is the one thing that isn't crossplatform
            path = string.join(["wget -nc", year]," ")
            command, command2 = [], []
            command.append(path)
            command.append(item)
            command2.append(path)
            command2.append(item2)
            # This is crossplatform, but the command string itself
            # is not.
            os.popen(string.join(command,''))
            os.popen(string.join(command2,''))
            today = today - 1
import urllib

def download_cross_platform():
    for year in YEARS:
        print year
        today = START[year]
        # Depending on the year PA used either Jpegs or Gifs
        # A more intelligent script could probably check to see
        # which one is available, and then download appropriately.
        # This one just bruteforce tries both types for everything.
        type = "h.jpg"
        type2 = "h.gif"
        finish = FINISH[year]
        while(today >= finish):
            list,list2 = [],[]
            list.append("%s" % today)
            list2.append("%s" % today)
            list.append(type)
            list2.append(type2)
            item = string.join(list,'')
            item2 = string.join(list2,'')
            path = string.join([year,item],"")
            path2 = string.join([year,item2],"")
            print path
            urllib.urlretrieve(path, item)
            print path2
            urllib.urlretrieve(path2, item2)
            today = today - 1
# Useful Data values for the above functions:
YEARS = ["http://www.penny-arcade.com/images/1998/",
         "http://www.penny-arcade.com/images/1999/",
         "http://www.penny-arcade.com/images/2000/",
         "http://www.penny-arcade.com/images/2001/",
         "http://www.penny-arcade.com/images/2002/",
         "http://www.penny-arcade.com/images/2003/"]
START = {"http://www.penny-arcade.com/images/1998/":normalDate.ND(19981231),
         "http://www.penny-arcade.com/images/1999/":normalDate.ND(19991231),
         "http://www.penny-arcade.com/images/2000/":normalDate.ND(20001231),
         "http://www.penny-arcade.com/images/2001/":normalDate.ND(20011231),
         "http://www.penny-arcade.com/images/2002/":normalDate.ND(20021231),
         "http://www.penny-arcade.com/images/2003/":normalDate.ND(20031231)}
FINISH = {"http://www.penny-arcade.com/images/1998/":normalDate.ND(19980101),
          "http://www.penny-arcade.com/images/1999/":normalDate.ND(19990101),
          "http://www.penny-arcade.com/images/2000/":normalDate.ND(20000101),
          "http://www.penny-arcade.com/images/2001/":normalDate.ND(20010101),
          "http://www.penny-arcade.com/images/2002/":normalDate.ND(20020101),
          "http://www.penny-arcade.com/images/2003/":normalDate.ND(20030101)}

if __name__=="__main__":
    download_cross_platform()
Please note that this code relies on a date type written by Jeff Bauer of Starship Python. The code is included here in the downloadable example, and it is good code worth taking a look at. However, Mr. Bauer did a very nice job so I won't bother butchering his work here.
In the above code I'm using a very blunt instrument. There's not a lot of finesse in this application, but then I just wanted a simple hack to download a lot of files from this site. So a bunch of data is hardcoded when it should be automagically generated, and I could avoid trying every single date for the last 6 years. As it stands, this code tries about 2,200 dates in both formats, so it will bombard the Penny Arcade server with more than 4,000 requests. I don't really recommend using this code too much, but I think it's a decent example.
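The hardcoded YEARS, START and FINISH tables could be generated instead. Here is a minimal sketch of that idea, written for Python 3 (the listing above is Python 2) and using the standard datetime module in place of normalDate; that swap, the BASE_URL constant and the function names are my own assumptions, not part of the original script. It only builds the list of URLs and downloads nothing:

```python
from datetime import date, timedelta

# Same URL layout as the hardcoded YEARS list (hypothetical constant name).
BASE_URL = "http://www.penny-arcade.com/images/%d/"
# Both image formats are tried for every date, as in the original script.
SUFFIXES = ("h.jpg", "h.gif")


def dates_in_year(year):
    """Yield every date of the given year, newest first, like the while loop."""
    day = date(year, 12, 31)
    while day.year == year:
        yield day
        day -= timedelta(days=1)


def candidate_urls(year):
    """Yield every URL the brute-force script would request for one year."""
    base = BASE_URL % year
    for day in dates_in_year(year):
        stamp = day.strftime("%Y%m%d")  # e.g. 19981231, the PA naming scheme
        for suffix in SUFFIXES:
            yield base + stamp + suffix


if __name__ == "__main__":
    # Count the requests a full run over 1998-2003 would make.
    total = sum(1 for year in range(1998, 2004) for _ in candidate_urls(year))
    print(total)
```

Counting the candidates before downloading anything is a cheap way to see how hard a run would hit the server.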
Notice that there are two versions of the same code: one that works solely on Linux/Unix (where I originally wrote the script), and another that should work on any platform that can run Python. The key difference is a shift from reliance on a system application like "wget" to a Python library, namely "urllib". Both work fairly similarly; each takes a URL and that's about it.
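To make the wget/urllib contrast concrete, here is a small sketch for Python 3, where urlretrieve lives in urllib.request rather than in urllib as in the Python 2 listing. The function names wget_command and fetch are made up for this illustration:

```python
import urllib.request


def wget_command(url):
    """Build the argument list for the external wget program.

    This only works where wget is installed, which is why the first
    version of the script is tied to Linux/Unix.
    """
    return ["wget", "-nc", url]


def fetch(url, filename):
    """Download url into filename using only the standard library."""
    saved_to, _headers = urllib.request.urlretrieve(url, filename)
    return saved_to
```

The list from wget_command could be handed to subprocess.call to run it, while fetch needs nothing outside Python itself.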
You've probably noticed from playing with examples that you run these scripts in the following way:
$ python download_pa_images.py
This isn't like running an executable at the command prompt, and can be a little confusing or just inelegant. A nifty little hack for Unix based systems is the addition of a single line at the beginning of each script you want to use as if it were a program. The line looks like this:
#!/usr/bin/python
# The above should always be the path of your python
Additionally you'll have to make this file executable, and then run it like you normally would:
$ chmod +x download_pa_images.py
$ ./download_pa_images.py
Alternatively you can use the env program to invoke Python regardless of where it's located on the user's system, assuming you can find env.
#!/usr/bin/env python
This hack is unavailable on Windows. However, double-clicking on Python files in Windows will cause them to be executed, which is a behavior that doesn't show up on the Unix side.
Another thing that's useful in scripting is defining what happens when the script file is run. Normally typing:
$ python myfile.py
executes all the code in "myfile.py", and prints output if there is any, does computations, makes files, whatever. However, sometimes you want to define a set of functions, and execute specific ones when you call the file as a script. By defining a main section (the if __name__=="__main__": block at the bottom of the script), you could specify that when you run "download_pa_images.py" you use the "download_lin()" function, or the "download_cross_platform()" function.
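As a rough illustration of choosing a function when the file is run as a script, here is a Python 3 sketch. The two download functions are stand-ins that only return a message, and the MODES table and the "wget"/"urllib" command line words are invented for this example:

```python
import sys


def download_lin():
    # Stand-in for the wget based version; downloads nothing here.
    return "would download with wget"


def download_cross_platform():
    # Stand-in for the urllib based version; downloads nothing here.
    return "would download with urllib"


# Hypothetical mapping from a command line word to the function to run.
MODES = {"wget": download_lin, "urllib": download_cross_platform}


def main(argv):
    """Pick a function from the first argument; default to the portable one."""
    mode = argv[1] if len(argv) > 1 else "urllib"
    if mode not in MODES:
        return "usage: python download_pa_images.py [wget|urllib]"
    return MODES[mode]()


# This block runs only when the file is executed as a script,
# not when it is imported as a module by other code.
if __name__ == "__main__":
    print(main(sys.argv))
```

Importing the file from another script gives access to the functions without triggering any of them.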
Note: I'm playing with Zipfiles here because they are used on a pretty wide variety of systems. However, Python also has support for Gzip, Bzip, Tar, and possibly others. So don't think that you don't have options. These libraries are generally similar in usage, so I won't cover them here (unless someone asks).
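For comparison only, here is a small Python 3 sketch of the same idea with the standard tarfile module, to show how similar the usage is. The function names tar_it and list_tar are made up for this illustration:

```python
import os
import tarfile


def tar_it(path, archname):
    """Pack a file or a whole directory tree into a gzip-compressed tar file."""
    # Store entries under the last path component rather than the full path.
    top_name = os.path.basename(os.path.normpath(path))
    with tarfile.open(archname, "w:gz") as archive:
        archive.add(path, arcname=top_name)  # directories are added recursively
    return archname


def list_tar(archname):
    """Return the names stored in a gzip-compressed tar file."""
    with tarfile.open(archname, "r:gz") as archive:
        return archive.getnames()
```

Unlike the zip scripts, no recursive helper is needed here because tarfile walks directories on its own.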
Let's say that you really hate WinZip, but you like Python. You could do the following:
# Simple Application/Script to Compress a File or Directory
# Essentially you could use this instead of Winzip
# "I hate GUI Applications like WINZIP!!!"
# "Argh!"
# "I like Python though."
"""
Path can be a file or directory
Archname is the name of the to be created archive
"""
from zipfile import ZipFile, ZIP_DEFLATED
import os # File stuff
import sys # Command line parsing

def zippy(path, archive):
    paths = os.listdir(path)
    for p in paths:
        p = os.path.join(path, p) # Prefix the name with its parent directory
        if os.path.isdir(p): # Recursive case
            zippy(p, archive)
        else:
            archive.write(p) # Write the file to the zipfile
    return

def zipit(path, archname):
    # Create a ZipFile Object primed to write
    archive = ZipFile(archname, "w", ZIP_DEFLATED) # "a" to append, "r" to read
    # Recurse or not, depending on what path is
    if os.path.isdir(path):
        zippy(path, archive)
    else:
        archive.write(path)
    archive.close()
    return "Compression of \""+path+"\" was successful!"

# The pieces are joined with explicit spaces and newlines so the
# help text doesn't run together when printed.
instructions = "zipit.py: Simple zipfile creation script.\n" + \
               "Recursively zips files in a directory into " + \
               "a single archive.\n" + \
               "e.g.: python zipit.py myfiles myfiles.zip"

# Notice the __name__=="__main__"
# this is used to control what Python does when it is called from the
# command line. I'm sure you've seen this in some of my other examples.
if __name__=="__main__":
    if len(sys.argv) >= 3:
        result = zipit(sys.argv[1], sys.argv[2])
        print result
    else:
        print instructions
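The hand-written recursion in zippy can also be replaced by os.walk, which visits every directory below a starting point. A minimal Python 3 sketch of that alternative, with the function name zip_tree made up for this example:

```python
import os
from zipfile import ZipFile, ZIP_DEFLATED


def zip_tree(path, archname):
    """Zip a single file, or every file below a directory, into archname."""
    with ZipFile(archname, "w", ZIP_DEFLATED) as archive:
        if os.path.isdir(path):
            # os.walk yields each directory along with the files it contains.
            for folder, _subfolders, filenames in os.walk(path):
                for filename in filenames:
                    archive.write(os.path.join(folder, filename))
        else:
            archive.write(path)
    return archname
```

As with the script above, the archive should be created outside the directory being zipped, otherwise it may try to include itself.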
And of course we can unzip things too. You'll notice that unzipping is quite different. Getting data out of the ZipFile is similar enough to ordinary file reading that I hope it isn't too confusing. On the other hand, what is a little tricky is recreating all the directories that need to exist so that you can write out the appropriate file paths. A simple check with the os.path.dirname call takes care of it though, as you can see. I can't stress enough how useful the online help for Python is.
# Simple script to Unzip archives created by
# our Zip Scripts.
import sys
import os
from zipfile import ZipFile, ZIP_DEFLATED

def unzip( path ):
    # Create a ZipFile Object Instance
    archive = ZipFile(path, "r", ZIP_DEFLATED)
    names = archive.namelist()
    for name in names:
        directory = os.path.dirname(name)
        # dirname is "" for files at the top level of the archive,
        # so only create a directory when there is one to create
        if directory and not os.path.exists(directory):
            # Create that directory (and any missing parents)
            os.makedirs(directory)
        if name.endswith("/"):
            continue # A directory entry, there is no file to write
        # Write files to disk
        temp = open(name, "wb") # create the file
        data = archive.read(name) #read the binary data
        temp.write(data)
        temp.close()
    archive.close()
    return "\""+path+"\" was unzipped successfully."

instructions = "This script unzips plain jane zipfiles:\n"+\
               "e.g.: python unzipit.py myfiles.zip"

if __name__=="__main__":
    if len(sys.argv) == 2:
        msg = unzip(sys.argv[1])
        print msg
    else:
        print instructions
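Newer versions of the zipfile module (Python 2.6 and later) also provide an extractall method that recreates the needed directories itself. A short Python 3 sketch of that shortcut, with the function name unzip_all and the target parameter being my own choices for the example:

```python
from zipfile import ZipFile


def unzip_all(archname, target="."):
    """Extract every member of archname below the target directory."""
    with ZipFile(archname) as archive:
        # extractall creates any missing directories for us.
        archive.extractall(target)
        return archive.namelist()
```

The manual loop in the script is still worth reading, since it shows what extractall has to do behind the scenes.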
Now let's say we have a directory full of movies, large text files or Word documents, and music in the form of mp3s. In an effort to save disk space, and knowing that it's not uncommon for a 20 Megabyte text file to take up only 1 Megabyte after compression, we figure we can save some space on our 300 Gigabyte hard drive. (Ha!) Using our knowledge from the Zip scripts above, let's write a more specific script to handle the compression of a specific type of file on an entire tree of directories.
I'm currently using something like this to produce archives of all the sourcecode and examples in this tutorial. I could have used a program like tar or gzip, or even winzip, but doing it with Python is just more fun I think.
# Simple Application/Script to Compress files of a
# specific type in a Directory.
# I'm using this to build the packages of
# example codes for download that are
# on the first page of this tutorial.
# Yes: You could use Zip, Winzip, Tar, or whatever
# to pick out just the code files and so on, but this
# is just a little more fun.
"""
Path can be a file or directory
Archname is the name of the to be created archive
"""
from zipfile import ZipFile,ZIP_DEFLATED
import os
import sys

def zipty(path, archive, type):
    paths = os.listdir(path)
    for p in paths:
        p = os.path.join(path, p)
        if os.path.isdir(p):
            zipty(p, archive,type)
        elif os.path.splitext(p)[1] == type: #Just a little change here
            archive.write(p)
    return

def ziptype(path, archname, type):
    # Create a ZipFile Object primed to write
    archive = ZipFile(archname, "w",ZIP_DEFLATED) # "a" to append, "r" to read
    # Recurse or not, depending on what path is
    if os.path.isdir(path):
        zipty(path, archive, type)
    elif os.path.splitext(path)[1] == type: #This is all we're changing
        archive.write(path)
    else:
        # A single file of the wrong type: report the failure
        archive.close()
        return "Compression of \""+path+"\" failed!"
    archive.close()
    return "Compression of \""+path+"\" was successful!"

instructions = "ziptype.py: Simple zipfile creation script.\n" + \
               "Recursively zips files in a directory that match " +\
               "a specific type.\n"+\
               "e.g.: python ziptype.py myfiles .py myfiles.zip"

# Notice the __name__=="__main__"
# this is used to control what Python does when it is called from the
# command line. I'm sure you've seen this in some of my other examples.
if __name__=="__main__":
    if len(sys.argv) == 4:
        # Command line order: "directory/path", "filetype", "archivename"
        result = ziptype(sys.argv[1], sys.argv[3], sys.argv[2])
        print result
    else:
        print instructions
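How much space an archive really saves depends heavily on the data, so it can be worth checking rather than assuming a figure like 20 to 1. Here is a small Python 3 sketch that reads the sizes recorded inside an existing zip file; the function names compression_report and total_ratio are invented for this example:

```python
from zipfile import ZipFile


def compression_report(archname):
    """Return (name, original bytes, compressed bytes) for each member."""
    with ZipFile(archname) as archive:
        return [(info.filename, info.file_size, info.compress_size)
                for info in archive.infolist()]


def total_ratio(archname):
    """Return compressed size divided by original size for the whole archive."""
    rows = compression_report(archname)
    original = sum(row[1] for row in rows)
    compressed = sum(row[2] for row in rows)
    # An archive holding only empty files has nothing to compare.
    return compressed / original if original else 1.0
```

Plain text usually gives a small ratio, while already compressed files such as mp3s or movies tend to stay close to 1.0.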
That's about it for some scripting examples. I'd love to do some more diverse things. I kind of got sidetracked on Zipfiles here, but I found that I actually needed a simple zip utility, and now I've got one, and a pretty simple example. I'd like to apologize to people who are interested in scripting some other area. In an attempt to make it up to those of you who feel slighted, please let me know what you think of these examples, and/or what you would rather see, and I'll add some new examples as soon as possible.
Download the source for this example here.