	                   GETLEFT

- Introduction

- What this program is not

- What this program has not

- What this program lacks

- Where to get it

- Requirements

- Menus

- Credits



Introduction

Once upon a time I tried to download one month's worth of a mailing
list archive. I tried to do it with Getright, but that program
can only process up to 500 links in a web page, and this archive had
well over one thousand. At the same time, I learned that Tcl could
download files, so I thought "How hard can it be to make my own
program?".

So here is my little effort, or at least an early version of it.
It is supposed to download complete Web sites. You give it a URL,
and off it goes, happily downloading every linked URL in
that site, or even in other sites.

While it goes, it changes the original pages: all the links
get changed to relative links, so that you can surf the site on
your hard disk without those pesky absolute links.
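
As a rough illustration of what turning an absolute link into a relative
one involves, here is a minimal Python sketch. It is not Getleft's own Tcl
code; the function name 'to_relative' and the choice to leave links to
other hosts untouched are assumptions made for the example.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def to_relative(page_url: str, href: str) -> str:
    """Rewrite href so that it is relative to the directory of page_url.

    Assumption for this sketch: links to another host are returned as-is.
    """
    target = urljoin(page_url, href)            # resolve to an absolute URL
    page, dest = urlsplit(page_url), urlsplit(target)
    if page.netloc != dest.netloc:
        return href                             # off-site link, leave alone
    page_dir = posixpath.dirname(page.path or "/")
    return posixpath.relpath(dest.path or "/", start=page_dir)


print(to_relative("http://example.com/docs/a/index.html",
                  "http://example.com/docs/img/logo.png"))
# prints ../img/logo.png
```

The relative form keeps working after the whole tree is copied to a local
directory, which is the point of the rewriting described above.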

This is an alpha release, which means it may behave in an erratic
way or fail to work at all. It kind of works for me, but be advised
that it may not for you.


What this program is not

A Getright substitute. I have seen on Deja.com that some people
seem to think so, and while I get a big ego boost from reading it,
the fact remains that it isn't.

If you came here searching for a Linux answer to Getright, look for
'Caitoo', part of the KDE project, or 'Downloader for X' for GNOME.
(But please, give Getleft a try.)

What this program has not

A capital 'L'. Believe me, one capital per word is good enough for me.

What this program lacks

Getleft doesn't support Java, Javascript, etc. It only understands plain
HTML.


Where to get it

You can check the latest version of the program on freshmeat:

http://freshmeat.net/appindex/1999/07/17/932219913.html

Until I get my own home page, this will have to do.


Requirements

It works on Linux and Windows, and I guess other Unix variants will be
fine. It looks and works better on Linux, though.

The program requires Tcl/Tk 8.1 or newer. It will, most definitely,
not work with earlier versions due to the regular expressions used.

In Linux you can check which version you have by typing:

wish
% puts $tcl_version

In most cases it will return something like 8.0, as version 8.1
is only about a year old and you can't expect Linux distributors to react
that quickly to change (in fact, Tcl is already at version 8.3.1).

For example, the latest versions of Openlinux, Red Hat, Mandrake and
Suse have shipped with Tcl/Tk 8.0.x.

You can find 'rpms' for Tcl/Tk 8.2 on Red Hat's contrib site.

In Windows, if you do not know what version you have, you probably
do not have any.

If you don't know what I am talking about, check 
www.scriptics.com, the company that makes the Tcl/Tk interpreter
(don't panic, it's free).

To do the actual downloading, Getleft uses the program 'curl'.
You can get it from the web page of its author, Daniel Stenberg:

http://curl.haxx.nu

And don't you worry, it's also free (Am I cheap or what?)

In the Unix/Linux world that's it. For Windows you also need the
executables found in the win.exe self-extracting archive.
Put these files, together with curl for Windows, in the same
directory as 'Getleft.tcl'.

These files for Windows come from the Cygwin project by Cygnus:

http://sourceware.cygnus.com/cygwin/

Menus

The following is a short description of what the options in the
menus do, or at least what they are meant to do.


File menu


Enter URL

A small dialog box appears; whatever is in the clipboard will appear
in the entry box.

So if you are surfing the Internet and see something you like, copy
the URL you are on, fire Getleft up and you will see the URL ready for
downloading.

Opening the combobox shows the last URLs you entered.
If you only remember to select the URL after you see the dialog, select
it and then right-click on the entry.

After you give the URL to download, you will be asked where to store
the bounty. You have to give either an existing directory or the place
where a new one should be created.

Then the first file is downloaded and processed. After that, the program
shows a dialog box with all the links in that first page, and you get to
choose which ones to follow and which to ignore. If you right-click on this
dialog, a pop-up menu will appear to make choosing simpler.

You can also enter the URL of a regular file, like an archive or whatever. In
this case the file will be downloaded into the directory you entered, ignoring
the directory the file was in on the remote site. You can also enter an FTP
URL, but Getleft will not download it recursively, at least not yet.

Then the real downloading will begin, link after link, in quite a boring
fashion. The dialog that shows the progress of a download has a button with
which you can pause and resume the download. You can also skip the file
currently being downloaded.

There is an error log, a file called 'geterror.log'. If any error is detected
during the download, a window will appear at the end of the downloading to
show you this error log.

Site Map

Use this command to get a map of the Web site. To begin with, Getleft will
download all the HTML files and active content pages in the site and then
present you with a dialog box in which you can choose which files will be
downloaded.


Stop

After this page: stops after downloading all links in the current page.
After this file: stops after downloading the current file.
Now: stops as soon as you confirm.

Pause

After this page: pauses after downloading all links in the current page.
After this file: pauses after downloading the current file.

To resume, uncheck the option in the menu.

Exit

Well, it exits the program.


Options menu


Up links

Choose whether you want to follow (the default) or ignore links to pages
that are above the current one in the site directory structure.
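
One way to read "above the current page" is that the link's directory is a
proper ancestor of the current page's directory. The Python sketch below
shows that reading only; it is not taken from Getleft, and the name
'is_up_link' is made up for the example. It assumes the paths contain no
'..' segments.

```python
import posixpath
from urllib.parse import urlsplit


def _directory(url: str) -> str:
    """Directory part of the URL path, always ending in '/'."""
    d = posixpath.dirname(urlsplit(url).path or "/")
    return d if d.endswith("/") else d + "/"


def is_up_link(current_url: str, link_url: str) -> bool:
    """True if link_url lives in a parent directory of current_url."""
    current_dir = _directory(current_url)
    link_dir = _directory(link_url)
    return link_dir != current_dir and current_dir.startswith(link_dir)


page = "http://example.com/a/b/index.html"
print(is_up_link(page, "http://example.com/a/index.html"))      # prints True
print(is_up_link(page, "http://example.com/a/b/c/page.html"))   # prints False
```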


Levels

The number of levels of links you want the program to follow.

The default is 'no limit'. '0' will download a page but none of its links,
not even the graphics.
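
The effect of a level limit can be pictured as a breadth-first walk that
stops following links once a given depth is reached. The sketch below is
Python, not Getleft's code, and works on a made-up in-memory table of links
instead of real pages; 'pages_to_fetch' and 'site' are hypothetical names.

```python
from collections import deque
from typing import Dict, List, Optional


def pages_to_fetch(links: Dict[str, List[str]], start: str,
                   max_level: Optional[int] = None) -> List[str]:
    """Return what would be fetched from start within max_level hops.

    max_level=None means 'no limit'; 0 returns only the start page.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, level = queue.popleft()
        if max_level is not None and level >= max_level:
            continue                    # do not follow links from this page
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, level + 1))
    return order


site = {"index.html": ["a.html", "logo.gif"], "a.html": ["b.html"]}
print(pages_to_fetch(site, "index.html", 0))   # prints ['index.html']
print(pages_to_fetch(site, "index.html", 1))
# prints ['index.html', 'a.html', 'logo.gif']
```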

External links

Here you can choose whether Getleft should follow links outside the domain
of the url you entered.
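
A simple test for "outside the domain" is to compare host names. The Python
sketch below assumes exactly that, so 'www.example.com' and 'example.com'
would count as different sites; whether Getleft draws the line the same way
is not stated here, and 'is_external' is a name invented for the example.

```python
from urllib.parse import urlsplit


def is_external(start_url: str, link_url: str) -> bool:
    """True if link_url names a host different from that of start_url."""
    start_host = urlsplit(start_url).hostname or ""
    link_host = urlsplit(link_url).hostname or ""
    # A relative link has no host of its own, so it stays on the same site.
    return link_host != "" and link_host != start_host


print(is_external("http://example.com/a.html", "http://other.org/b.html"))
# prints True
print(is_external("http://example.com/a.html", "pics/x.gif"))
# prints False
```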

Filter Files

Only Html

If you check this option, only HTML and active content pages will be
downloaded.

Images

When a thumbnail is given as the link to an image, you can choose to download
only the thumbnail or only the linked image.

Choose filter

A dialog box appears in which you can choose which types of files you do not
want to download.

Resume

Sometimes a file download gets interrupted. Check this option if you want
'Getleft' to continue where it left off.

Update

Usually, Getleft does not download files that are already on your hard
disk. Check this option if you want Getleft to download them again when
there are newer versions on the server.

CGI

If you want to follow links that go through CGI scripts, check this option.
The program is not that intelligent: it only identifies CGI scripts if the
link includes a '?' to pass parameters.
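
The rule described above is simple enough to show in a few lines. This is a
Python sketch of the idea, not the program's Tcl source; the function names
and the 'follow_cgi' parameter are invented for the example.

```python
from typing import Iterable, List


def looks_like_cgi(link: str) -> bool:
    """Treat any link that carries a '?' as going through a CGI script."""
    return "?" in link


def links_to_follow(links: Iterable[str], follow_cgi: bool = False) -> List[str]:
    """Drop CGI-looking links unless the CGI option is checked."""
    return [link for link in links if follow_cgi or not looks_like_cgi(link)]


found = ["index.html", "search.cgi?q=tcl", "cgi-bin/list"]
print(links_to_follow(found))
# prints ['index.html', 'cgi-bin/list']
```

Note that 'cgi-bin/list' slips through: without a '?' in the link, a check
of this kind cannot tell that a script is involved.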


Use proxy

Check this option if you need to use a proxy to access the Internet.

Check size

Getleft compares the size of the files it has just downloaded with the size
reported by the server. This way it knows whether the file is in fact
complete, and if it isn't, it can resume the download. Unfortunately, some
servers, I don't know why, report wrong sizes, which makes Getleft download
the file again and again without ever reaching the size reported by the
server.

So, if you are trying to download a site, and Getleft is stuck trying to 
download a file, uncheck this option. If it still doesn't work, e-mail me the
url.
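
The decision behind this option can be sketched as follows. This is an
illustration in Python under simple assumptions, not the actual logic in
Getleft: 'next_action' and its return values are hypothetical, and a missing
size from the server is treated as "nothing to compare".

```python
from typing import Optional


def next_action(local_size: int, reported_size: Optional[int],
                check_size: bool = True) -> str:
    """Decide what to do with a file after a download attempt."""
    if not check_size or reported_size is None:
        return "done"        # option unchecked, or no size to compare with
    if local_size == reported_size:
        return "done"        # sizes agree, the file is complete
    return "retry"           # sizes differ, try to get the rest


print(next_action(1000, 1000))                     # prints done
print(next_action(1000, 1200))                     # prints retry
print(next_action(1000, 1200, check_size=False))   # prints done
```

With a server that reports 1200 bytes for a file that really has 1000, the
second call returns "retry" every time, which is the endless loop described
above; unchecking the option breaks out of it.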


Tools menu


Purge files

This option lets you recursively scan a directory tree, deleting
files that match a certain pattern and replacing them with empty files of
the same name.

This is useful if a site takes more than one session to download. For example,
imagine you use two computers: one with a fast, reliable or at least cheap
Internet connection, like the computer at work or at college, and another one,
your very own, without it. You can download the sites on the first one and
take them to the second on floppies, Zips, etc. With this option you can free
space on the first computer, and Getleft, seeing that the files already exist,
won't try to download them again.
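
In outline, purging amounts to walking the tree and emptying every matching
file. The Python sketch below shows one way to do that; it is not Getleft's
code, the name 'purge' is made up, and it assumes shell-style patterns such
as '*.zip'. Truncating a file in place gives the same end result as deleting
it and creating an empty file with the same name.

```python
import fnmatch
import os
from typing import List


def purge(root: str, pattern: str) -> List[str]:
    """Empty every file under root whose name matches pattern.

    Returns the paths of the files that were truncated to zero bytes.
    """
    purged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                path = os.path.join(dirpath, name)
                open(path, "w").close()    # same name, no content
                purged.append(path)
    return purged
```

For instance, purge("mysite", "*.zip") would leave zero-byte placeholders
for every Zip archive below the hypothetical directory 'mysite'.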


Restore orig

As the files get changed to keep only relative paths, the original ones are
kept with an '.orig' extension. This command deletes the new files and
renames the original ones to their real name.

This is useful if you actually want to mirror a site. This program, though,
is not a good tool for mirroring sites.
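
The restore step boils down to renaming each '.orig' file over the modified
copy next to it. Here is a small Python sketch of that idea, written for
illustration only; 'restore_originals' is a hypothetical name and not part
of Getleft.

```python
import os
from typing import List


def restore_originals(root: str, suffix: str = ".orig") -> List[str]:
    """Replace each rewritten file under root with its saved original.

    Returns the paths of the files that were restored.
    """
    restored = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix) and name != suffix:
                original = os.path.join(dirpath, name)
                target = original[:-len(suffix)]
                os.replace(original, target)   # overwrites the modified copy
                restored.append(target)
    return restored
```

After a call such as restore_originals("mysite"), 'index.html.orig' would be
gone and 'index.html' would again hold the page as the server sent it.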



Configure proxy

A dialog box appears in which you can enter the address of your proxy. You
can find the address in the configuration of your browser; failing that, you
can ask your network administrator.


Languages

You get to choose which language the program will use. At present only
English, French, Polish, Turkish, Korean and Spanish are supported. You will
need to have the proper fonts installed for them to show correctly.

In the 'Languages' directory you will find many 'menus.xx' files with the
different translations. You can delete those you don't need, but always
keep the English one, 'menus.en', if you don't want even more error messages
to appear.

If you want Getleft to support your own language, you will have to translate
it for me. Look for a file called 'Translating' in the distribution.



Help menu


Manual

Shows this text.


License

Shows the GNU license. Basically, you can do whatever you want with the
program except claim that you wrote it yourself or change the license.

It should also be very clear that the program comes with NO WARRANTY
whatsoever. In fact, I would be very surprised if it happened to work at all.


About

Shows some info about the program.


Credits


Author


Loath as I am to admit it, this is the brainchild of Andres Garcia.
You can contact me at:

andresgarci@retemail.es

or by snail-mail

Andrés García
Las Mestas 7, 3-3B
Gijón 33204
Asturias
Spain

You may send all kinds of bug reports, URLs that don't work, patches
or even lots of money.

Contributors

I would like to mention Brent Welch. Even though he has probably never heard
of Getleft, his book "Practical Programming in Tcl and Tk" has helped me a
great deal; in fact, some of his tricks can be found all around Getleft (the
spaghetti code is all mine, though).

Translators

- Polish: ooshy <ooshy@poczta.onet.pl>

- French: Eric Seigne <erics@mail.dotcom.fr>

- Turkish: Fisek Türkçe Grubu <linux-turkce@fisek.com.tr>

- Korean: Kim SeungBaeck <kongsi@taegu.linux.or.kr>


And last but not least bug reports from:

- LogicX <LogicX@LogicX.org>

- Miguel <onxmiguel@ornalux.es>

- Mike McCaffery <mike.mccaffery@barclays.co.uk>
