Bzip2 Howto
David Fetter, dfetter@best.com
v1.6 Tue Mar 10 17:48:42 PST 1998

This document tells how to use the new bzip2 compression program. The original sgml is here.

1. Introduction

Bzip2 is a groovy new algorithm for compressing data. It generally makes files that are 60-70% of the size of their gzip'd counterparts.

This document will take you through a few common applications for bzip2.

French speakers may wish to refer to Arnaud Launay's French documents. The web version is here, and you can use ftp here. Arnaud can be contacted by electronic mail at this address.

Japanese speakers may wish to refer to Tetsu Isaji's Japanese documents here. Isaji can be reached at his home page, or by electronic mail at this address.

1.1. Revision History

1.1.1. v1.6

Added TenThumbs' Netscape enabler. Also changed lesspipe.sh per his suggestion. It should work better now.

1.1.2. v1.5

Added Arnaud Launay's French translation, and his wu-ftpd file.

1.1.3. v1.4

Added Tetsu Isaji's Japanese translation.

1.1.4. v1.3

Added Ulrik Dickow's .emacs for 19.30 and higher. (Also corrected jka-compr.el patch for emacs per his suggestion. Oops! Bzip2 doesn't yet(?) have an "append" flag.)

1.1.5. v1.2

Changed patch for emacs so it automagically recognizes

1.1.6. v1.1

Added patch for emacs.

1.1.7. v1.0

Round 1.

2. Getting bzip2

Bzip2's home page is at The UK home site. The United States mirror site is here. You can also find it on Red Hat's ftp site here.

2.1. Getting bzip2 precompiled binaries

See the home sites. Red Hat's intel binary is here. Debian's is here, and Slackware's is here. You can also get these in the analogous places at the various mirror sites.

2.2. Getting bzip2 sources

They come from the Official sites (see ``Getting Bzip2'' for where, or Red Hat has it here).

2.3.
Compiling bzip2 for your machine

If you have gcc 2.7.2.3, change the line that reads

    CFLAGS = -O3 -fomit-frame-pointer -funroll-loops

to

    CFLAGS = -fomit-frame-pointer -funroll-loops

that is, take out the -O3 part. After that, just make it and install it per the README.

3. Using bzip2 by itself

Read the Fine Manual Page :)

4. Using bzip2 with tar

Basically, there are two ways to use this, namely

4.1. Easier to set up:

This method requires no setup at all. To un-tar the bzip2'd tar archive foo.tar.bz2 in the current directory, do

    /path/to/bzip2 -cd foo.tar.bz2 | tar xf -

This works, but can be a PITA to type often.

4.2. Easier to use:

Apply the following patch to gnu tar 1.12, compile it, and install it, and you're good to go. Check that both tar and bzip2 are in your $PATH with "which tar" and "which bzip2". To use it, just do

    tar xyf foo.tar.bz2

to decompress the file. To make a new archive, it's similar:

    tar cyf foo.tar.bz2 file1 file2 file3...directory1 directory2...

And here's the patch :)

    *** tar.c.orig Sat Apr 26 05:09:49 1997
    --- tar.c Feb 2 00:50:47 1998
    ***************
    *** 16,21 ****
    --- 16,24 ----
         with this program; if not, write to the Free Software Foundation, Inc.,
         59 Place - Suite 330, Boston, MA 02111-1307, USA.
         */

    + /* Feb 2 98: patched by David Fetter to use bzip2 as a
    +    filter (option -y) */
    +
      #include "system.h"

      #include
    ***************
    *** 196,201 ****
    --- 199,206 ----
        {"block-number", no_argument, NULL, 'R'},
        {"block-size", required_argument, NULL, OBSOLETE_BLOCKING_FACTOR},
        {"blocking-factor", required_argument, NULL, 'b'},
    +   {"bzip2", no_argument, NULL, 'y'},
    +   {"bunzip2", no_argument, NULL, 'y'},
        {"catenate", no_argument, NULL, 'A'},
        {"checkpoint", no_argument, &checkpoint_option, 1},
        {"compare", no_argument, NULL, 'd'},
    ***************
    *** 372,377 ****
    --- 377,383 ----
        PATTERN at list/extract time, a globbing PATTERN\n\
        -o, --old-archive, --portability write a V7 format archive\n\
        --posix write a POSIX conformant archive\n\
    +   -y, --bzip2, --bunzip2 filter the archive through bzip2\n\
        -z, --gzip, --ungzip filter the archive through gzip\n\
        -Z, --compress, --uncompress filter the archive through compress\n\
        --use-compress-program=PROG filter through PROG (must accept -d)\n"),
    ***************
    *** 448,454 ****
        Y per-block gzip compression */

      #define OPTION_STRING \
    !   "-01234567ABC:F:GK:L:MN:OPRST:UV:WX:Zb:cdf:g:hiklmoprstuvwxz"

      static void
      set_subcommand_option (enum subcommand subcommand)
    --- 454,460 ----
        Y per-block gzip compression */

      #define OPTION_STRING \
    !   "-01234567ABC:F:GK:L:MN:OPRST:UV:WX:Zb:cdf:g:hiklmoprstuvwxyz"

      static void
      set_subcommand_option (enum subcommand subcommand)
    ***************
    *** 805,810 ****
    --- 811,820 ----
        case 'X':
          exclude_option = 1;
          add_exclude_file (optarg);
    +     break;
    +
    +   case 'y':
    +     set_use_compress_program_option ("bzip2");
          break;

        case 'z':

5. Using bzip2 with less

To uncompress bzip2'd files on the fly, i.e. to be able to use "less" on them without first bunzip2'ing them, you can make a lesspipe.sh (man less) like this:

    #!/bin/sh
    # This is a preprocessor for 'less'.
    # It is used when this environment
    # variable is set: LESSOPEN="|lesspipe.sh %s"

    case "$1" in
      *.tar) tar tvvf "$1" 2>/dev/null ;; # View contents of various tar'd files
      *.tgz) tar tzvvf "$1" 2>/dev/null ;;
    # This one works for the unmodified version of tar:
      *.tar.bz2) bzip2 -cd "$1" 2>/dev/null | tar tvvf - ;;
    # This one works with the patched version of tar:
    # *.tar.bz2) tar tyvvf "$1" 2>/dev/null ;;
      *.tar.gz) tar tzvvf "$1" 2>/dev/null ;;
      *.tar.Z) tar tzvvf "$1" 2>/dev/null ;;
      *.tar.z) tar tzvvf "$1" 2>/dev/null ;;
      *.bz2) bzip2 -dc "$1" 2>/dev/null ;; # View compressed files correctly
      *.Z) gzip -dc "$1" 2>/dev/null ;;
      *.z) gzip -dc "$1" 2>/dev/null ;;
      *.gz) gzip -dc "$1" 2>/dev/null ;;
      *.zip) unzip -l "$1" 2>/dev/null ;;
      *.1|*.2|*.3|*.4|*.5|*.6|*.7|*.8|*.9|*.n|*.man)
        FILE=`file -L "$1"` ; # groff src
        FILE=`echo $FILE | cut -d ' ' -f 2`
        if [ "$FILE" = "troff" ]; then
          groff -s -p -t -e -Tascii -mandoc "$1"
        fi ;;
      *) cat "$1" 2>/dev/null ;;
    # *) FILE=`file -L $1` ; # Check to see if binary, if so -- view with 'strings'
    #    FILE1=`echo $FILE | cut -d ' ' -f 2`
    #    FILE2=`echo $FILE | cut -d ' ' -f 3`
    #    if [ "$FILE1" = "Linux/i386" -o "$FILE2" = "Linux/i386" \
    #         -o "$FILE1" = "ELF" -o "$FILE2" = "ELF" ]; then
    #      strings $1
    #    fi ;;
    esac

6. Using bzip2 with emacs

6.1. Changing emacs for everyone:

I've written the following patch to jka-compr.el which adds bzip2 to auto-compression-mode.

Disclaimer: I have only tested this with emacs-20.2, but have no reason to believe that a similar approach won't work with other versions.

To use it,

  1. Go to the emacs-20.2/lisp source directory (wherever you untarred it)
  2. Put the patch below in a file called jka-compr.el.diff (it should be alone in that file ;).
  3. Do patch < jka-compr.el.diff
  4. Start emacs, and do M-x byte-compile-file jka-compr.el
  5. Leave emacs.
  6. Move your original jka-compr.elc to a safe place in case of bugs.
  7. Replace it with the new jka-compr.elc.
  8. Have fun!
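Steps 1 through 6 above could be scripted roughly as follows. This is only a sketch under a few assumptions: the function name apply_jka_patch, its two arguments, and the jka-compr.elc.orig backup name are made up for illustration, and patch and emacs are expected to be in your $PATH. It compiles in batch mode instead of with M-x byte-compile-file, and it leaves the new jka-compr.elc in the source directory; copying it over the installed one (step 7) is left to you, since that location varies.

```shell
#!/bin/sh
# Hypothetical helper, not part of emacs or bzip2.
#   $1 = the emacs-20.2/lisp source directory
#   $2 = absolute path to the file holding the patch (jka-compr.el.diff)
apply_jka_patch() {
    lisp_dir=$1
    diff_file=$2
    (
        # Step 1: go to the lisp source directory.
        cd "$lisp_dir" || exit 1
        # Step 2: the patch must already be saved in a readable file.
        [ -r "$diff_file" ] || exit 1
        # Step 6: keep the original compiled file in case of bugs.
        if [ -f jka-compr.elc ]; then
            cp -p jka-compr.elc jka-compr.elc.orig || exit 1
        fi
        # Step 3: apply the patch.
        patch < "$diff_file" || exit 1
        # Steps 4 and 5: byte-compile without an interactive session.
        emacs -batch -f batch-byte-compile jka-compr.el
    )
}
```

With hypothetical paths, a call would look like: apply_jka_patch /usr/src/emacs-20.2/lisp /tmp/jka-compr.el.diff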
    --- jka-compr.el Sat Jul 26 17:02:39 1997
    +++ jka-compr.el.new Thu Feb 5 17:44:35 1998
    @@ -44,7 +44,7 @@
     ;; The variable, jka-compr-compression-info-list can be used to
     ;; customize jka-compr to work with other compression programs.
     ;; The default value of this variable allows jka-compr to work with
    -;; Unix compress and gzip.
    +;; Unix compress and gzip. David Fetter added bzip2 support :)
     ;;
     ;; If you are concerned about the stderr output of gzip and other
     ;; compression/decompression programs showing up in your buffers, you
    @@ -121,7 +121,9 @@


     ;;; I have this defined so that .Z files are assumed to be in unix
    -;;; compress format; and .gz files, in gzip format.
    +;;; compress format; and .gz files, in gzip format, and .bz2 files,
    +;;; in the snappy new bzip2 format from http://www.muraroa.demon.co.uk.
    +;;; Keep up the good work, people!
     (defcustom jka-compr-compression-info-list
       ;;[regexp
       ;; compr-message compr-prog compr-args
    @@ -131,6 +133,10 @@
          "compressing" "compress" ("-c")
          "uncompressing" "uncompress" ("-c")
          nil t]
    +    ["\\.bz2\\'"
    +     "bzip2ing" "bzip2" ("")
    +     "bunzip2ing" "bzip2" ("-d")
    +     nil t]
         ["\\.tgz\\'"
          "zipping" "gzip" ("-c" "-q")
          "unzipping" "gzip" ("-c" "-q" "-d")

6.2. Changing emacs for one person:

Thanks for this one go to Ulrik Dickow, ukd@kampsax.dk, Systems Programmer at Kampsax Technology:

To make it so you can use bzip2 automatically when you aren't the sysadmin, just add the following to your .emacs file.

    ;; Automatic (un)compression on loading/saving files (gzip(1) and similar)
    ;; We start it in the off state, so that bzip2(1) support can be added.
    ;; Code thrown together by Ulrik Dickow for ~/.emacs with Emacs 19.34.
    ;; Should work with many older and newer Emacsen too. No warranty though.
    ;;
    (if (fboundp 'auto-compression-mode) ; Emacs 19.30+
        (auto-compression-mode 0)
      (require 'jka-compr)
      (toggle-auto-compression 0))
    ;; Now add bzip2 support and turn auto compression back on.
    (add-to-list 'jka-compr-compression-info-list
                 ["\\.bz2\\(~\\|\\.~[0-9]+~\\)?\\'"
                  "zipping" "bzip2" ()
                  "unzipping" "bzip2" ("-d")
                  nil t])
    (toggle-auto-compression 1 t)

7. Using bzip2 with wu-ftpd

Thanks to Arnaud Launay for this bandwidth saver. The following should go in /etc/ftpconversions to do on-the-fly compressions and decompressions with bzip2. Make sure that the paths (like /bin/compress) are right. The last entry passes -I to tar; if your tar was built with the patch in section 4.2, bzip2 is selected with -y instead, so adjust that flag to match your tar.

    :.Z: : :/bin/compress -d -c %s:T_REG|T_ASCII:O_UNCOMPRESS:UNCOMPRESS
    : : :.Z:/bin/compress -c %s:T_REG:O_COMPRESS:COMPRESS
    :.gz: : :/bin/gzip -cd %s:T_REG|T_ASCII:O_UNCOMPRESS:GUNZIP
    : : :.gz:/bin/gzip -9 -c %s:T_REG:O_COMPRESS:GZIP
    :.bz2: : :/bin/bzip2 -cd %s:T_REG|T_ASCII:O_UNCOMPRESS:BUNZIP2
    : : :.bz2:/bin/bzip2 -9 -c %s:T_REG:O_COMPRESS:BZIP2
    : : :.tar:/bin/tar -c -f - %s:T_REG|T_DIR:O_TAR:TAR
    : : :.tar.Z:/bin/tar -c -Z -f - %s:T_REG|T_DIR:O_COMPRESS|O_TAR:TAR+COMPRESS
    : : :.tar.gz:/bin/tar -c -z -f - %s:T_REG|T_DIR:O_COMPRESS|O_TAR:TAR+GZIP
    : : :.tar.bz2:/bin/tar -c -I -f - %s:T_REG|T_DIR:O_COMPRESS|O_TAR:TAR+BZIP2

8. Using bzip2 with Netscape under XWindows

tenthumbs@cybernex.net says:

I also found a way to get Linux Netscape to use bzip2 for Content-Encoding just as it uses gzip. Add this to $HOME/.Xdefaults or $HOME/.Xresources.

I use the -s option because I would rather trade some decompressing speed for RAM usage. You can leave the option out if you want to.

    Netscape*encodingFilters: \
        x-compress : : .Z : uncompress -c \n\
        compress : : .Z : uncompress -c \n\
        x-gzip : : .z,.gz : gzip -cdq \n\
        gzip : : .z,.gz : gzip -cdq \n\
        x-bzip2 : : .bz2 : bzip2 -ds \n

9. Using bzip2 with xv

I'm working on a patch which should let xv auto-decompress bzip2'd files the way it can do compress'd and gzip'd ones. Anybody want to help out?