This document aims to provide basic usage information for Hwrt, and is provided without warranty of any kind. Good luck!

Installing Hwrt

Hwrt is just a shell script. Just put it wherever you put other scripts and make sure it's executable:

mkdir -p ~/bin # maybe you don't have a "~/bin" at all
echo 'PATH="$PATH:$HOME/bin"' >> ~/.profile # if needed
mv hwrt.sh ~/bin/hwrt # if you don't like seeing ".sh"
chmod +x ~/bin/hwrt # change file mode to "executable"
type -p hwrt # make sure hwrt is in your path

Don't just copy-paste that without thinking — it was meant as an example for inexperienced users.
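
Whether the "if needed" line really is needed depends on whether "~/bin" is already in your PATH. The following is a minimal sketch of one way to check; path_contains is a hypothetical helper written for this example and is not part of Hwrt.

```shell
# Hypothetical helper (not part of Hwrt): succeed if directory $1 is one
# of the colon-separated entries in the PATH-style string $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/bin" "$PATH"; then
    echo "$HOME/bin is already in PATH"
else
    echo "$HOME/bin is not in PATH; add it in ~/.profile"
fi
```

Wrapping the string in colons before matching keeps a directory like "/home/me/bin" from being mistaken for "/home/me/bin2".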

Running Hwrt

After putting Hwrt somewhere in your path and rendering it executable, cross your fingers and type[1] something like

hwrt source destination_directory

If your destination directory (which, I might add, must exist before you run Hwrt) is "public_html", you can omit the second argument; moreover, if your source node is a regular file whose name has the stem "index", you can omit the first argument, too.
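
Since the destination directory must exist beforehand, it can be convenient to wrap the call. The sketch below is only an illustration: hwrt_into is a made-up name, and the wrapper assumes nothing about Hwrt beyond the "hwrt source destination_directory" form shown above and the "public_html" default.

```shell
# Hypothetical wrapper (not part of Hwrt): create the destination
# directory if it is missing, then pass both arguments to hwrt.
hwrt_into() {
    if [ "$#" -lt 1 ]; then
        echo "usage: hwrt_into source [destination_directory]" >&2
        return 2
    fi
    src=$1
    dest=${2:-public_html}   # same default destination the text describes
    mkdir -p "$dest" || return 1
    hwrt "$src" "$dest"
}
```

With this in place, "hwrt_into index.html" would create "public_html" if necessary and then run Hwrt against it.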

Configuring Hwrt

In the sections that follow, a file called “.hwrt_profile” is mentioned several times; it is a configuration file from which Hwrt can read default values for many variables.

Ordinarily, Hwrt looks for this file in the directory from which it is invoked, but you can use something like

hwrt -p my_dir/my_hwrt_profile

to specify a different location for the configuration file.

Keeping Hwrt Quiet

By default, Hwrt prints a period for every node that it visits; if you want to suppress this output during a given invocation, you can type

hwrt -q

If you want to suppress default output for every invocation, put

HWRT_VERBOSITY=0

in your “.hwrt_profile”.

Watching Hwrt's Progress

If you want to watch Hwrt crawl[2] (maybe because you're bored), type

hwrt -v

instead, which causes Hwrt to output a tree-like representation of your hypertext web “as it happens”. This may come in handy if your pageset is broken in some obscure way and you are having trouble making sense of the trace output in the logs.

Making Hwrt Stop

There are two ways in which Hwrt can be made to stop:

  1. The session crashes and every process in it is terminated as you are logged out.
  2. You issue a rapid succession of control-c's, hoping to outpace Hwrt and thus regain control of the terminal.

In either case, Hwrt's files may end up in an inconsistent state; for example, some cache entries may have a “current” timestamp (and thus be deemed valid by Hwrt) but incomplete contents. Hwrt attempts to recover automatically from such accidents and, in general, manages quite well; however, should Hwrt not manage to recover from an interruption, you can force a “recovery” by using the -r switch:

hwrt -r

Forcing Hwrt to Go On

By default, Hwrt tries to avoid uploading a broken pageset; however, if you want Hwrt to ignore all errors and soldier on, you can use

hwrt -i

This may be useful if you are in a great hurry and just want to upload the part that's working before looking at the error messages.

To make this the default behavior, put

HWRT_IGNORE_ERRORS=1

in your “.hwrt_profile”.

You can achieve a similar effect by specifying

hwrt -l 0

which sets the log level to zero and, thereby, instructs Hwrt not to keep track of errors at all. This means that things will fail silently, and that there will be no trace information in the logs; therefore, be careful when you use this feature.

Making Hwrt Log Everything

As you may have guessed from the foregoing section, the log level switch allows you to tell Hwrt how thorough a record of its actions it should keep. Increasing the log level from the default 1 to 2 enables logging of warnings, while typing

hwrt -l 3

will cause operations like clobbering and copying to be logged to the messages file.

Where to Find Log Files

You don't have to remember this at all. No, really. Whenever a log file is non-empty, Hwrt tells you where it is and suggests that you look at it, unless there are no errors and you've told Hwrt to be quiet, in which case Hwrt tells you nothing at all. OK, so occasionally you need to know where the log files are; in this unlikely event, you would look for files with stems like “errors”, “warnings”, and “messages” in a directory called “.hwrt”, found wherever Hwrt was invoked from.
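
For that unlikely event, here is a rough sketch of a helper that lists the non-empty log files. The exact file names are not spelled out above, so the sketch matches anything that starts with one of the three stems; hwrt_logs is a hypothetical name, not something Hwrt provides.

```shell
# Hypothetical helper (not part of Hwrt): print the non-empty log files
# under DIR/.hwrt, where DIR defaults to the current directory.
hwrt_logs() {
    dir=${1:-.}/.hwrt
    for stem in errors warnings messages; do
        for f in "$dir/$stem"*; do
            # -f: regular file; -s: exists and has a size greater than zero
            [ -f "$f" ] && [ -s "$f" ] && printf '%s\n' "$f"
        done
    done
    return 0
}
```

Run it from the directory where Hwrt was invoked, or pass that directory as the only argument.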

Enabling Automatic Uploads

To enable automatic pageset uploads, put something like

HWRT_SITE_URL="http://example.com/~user/" # [3]
HWRT_REMOTE_TARGET_ROOT_PATH="user@example.com:www/~user/"
HWRT_AUTO_UPLOAD="1"

in your “.hwrt_profile”.

On my system, I actually leave HWRT_AUTO_UPLOAD out and simply type

hwrt -u

whenever I do want the automatic upload to happen. Note that, when using such a configuration, you don't have to invoke Hwrt again if you forget the -u: if your “.hwrt_profile” is set up as above, Hwrt will suggest an rsync command that you can simply copy-paste and execute.

Installing Your Google Verification File

If you use Google Sitemaps, you can ensure the presence and prevent the removal of your verification file by including something like

HWRT_VERIFICATION_FILE_NAME=googlea3d2c4f6a0a5e197.html

in your “.hwrt_profile”.
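
Pulling the settings from the sections above together, a complete “.hwrt_profile” might look roughly like the sketch below. It assumes the file is read as plain shell variable assignments, which the snippets in this document suggest but do not state outright; all values are the placeholders used earlier, not recommendations.

```shell
# .hwrt_profile: example only, using the placeholder values from this document

HWRT_VERBOSITY=0                # keep Hwrt quiet, like -q
# HWRT_IGNORE_ERRORS=1          # uncomment to ignore errors by default, like -i

HWRT_SITE_URL="http://example.com/~user/"
HWRT_REMOTE_TARGET_ROOT_PATH="user@example.com:www/~user/"
HWRT_AUTO_UPLOAD="1"            # leave this out to upload only when you type -u

HWRT_VERIFICATION_FILE_NAME=googlea3d2c4f6a0a5e197.html
```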


[1] I know that typing with your fingers crossed is hard, but you do want the program to work, don't you?

[2] Really, though — it's a very slow program.

[3] The value of the HWRT_SITE_URL variable is used to generate absolute URLs when composing the sitemap file.