Copyright © 1995 Eugene W. Stark. See here for information on copying. Versions 2.4 and higher are shareware. Please read here to find out how to register your copy.
Copyright © 1995 Eugene W. Stark. All rights reserved.
Copyright © 1995 Eugene W. Stark. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
This product includes software developed by Eugene W. Stark.
THIS SOFTWARE IS PROVIDED BY EUGENE W. STARK (THE AUTHOR) ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
To pay the registration fee, please send a check or money order for $20 US currency to:
Eugene W. Stark
14 Landing Lane
Port Jefferson, NY 11777
USA
When I receive your registration fee, I will send you an acknowledgement of your registration, and place you on a mailing list to receive announcements of future updates to the program. Your registration fee entitles you to use any updated version of the program I release within two years of receiving your fee. Your registration fee helps me to justify spending time improving this program and responding to E-mail from users. Thanks!
I originally wrote this program because I tried the GEDCOM to HTML translator posted by Frode Kvam (frode@ifi.unit.no), and found it insufficiently flexible. Since his program only parsed a limited portion of the GEDCOM file, not including notes records, there wasn't an easy way to modify it to get all my notes into the output files. So, I decided to write a YACC-based parser for the GEDCOM standard, and to base the translator on that. The YACC parser was used in Version 1 of my program. However, as I gained more experience with the GEDCOM standard and how it is actually used in practice, I decided that it was too difficult to make the YACC-based parser accept the full variety of GEDCOM's that actually exist. So, for Version 2 I rewrote the parser so that it will accept essentially ``any'' GEDCOM file, and will only complain about grossly malformed input.
Since Version 2.0 many people have used this program to place their family history databases on the World Wide Web. Small GEDCOM's of under 1000 individuals are processed into HTML by GED2HTML in a few seconds on a modern PC running Un*x (processing is somewhat slower under Windows due to the less efficient filesystem). However, on a system with sufficient swap space and main memory, much larger GEDCOM's can be processed. The program has processed databases of well over 100,000 lines of GEDCOM and 10,000 individuals under both Un*x and Windows. The program is capable of processing all the GEDCOM's on Yvon Cyr's Acadian/French Canadian CD-ROM. The largest database I have attempted is the file ``t-roux.ged'' on that CD-ROM, which is a 5478458 byte, 214266 line GEDCOM file containing 15472 individuals and 7012 families. On my system (486/33 with 16MB RAM and IDE disks, running the FreeBSD 2.0.5 operating system), it took roughly 35 minutes to process this file, of which under five minutes were spent reading the file and constructing the database, and the remainder was spent in outputting 1548 HTML files of individual data, 10 individuals per file, organized into 31 directories, a three-level hierarchical index consisting of 574 HTML files, and a surname index in a single HTML file. The HTML output files consumed 18738K of disk space. The processing itself required 32MB of virtual memory.
I have used this program to prepare my own data for presentation on the World-Wide Web. You can view this data by starting from here. I preprocessed my GEDCOM file to produce approximately 700 individual files, which are linked to one another and to my hypertext family history document. Birger Wathne (Birger.Wathne@vest.sdata.no) and others have used various versions of this program in various demonstrations of genealogy over the World-Wide Web. Some of these demonstrations do not preprocess the data into HTML files, but rather use LifeLines to manage the database in GEDCOM format, and ged2html to process the output of queries for presentation over the Web. However, at present most people are using this program as a ``black box'' for quickly transforming their GEDCOM data into a form suitable for presentation on the World Wide Web. A good starting point for finding many of these databases is Tim Doyle's home page.
I have developed and run this program on an Intel 486DX/33 under the FreeBSD operating system. If you are using another flavor of Un*x, you shouldn't have too much trouble getting it to run. You do need an ANSI C compiler (like GCC), as I am no longer interested in writing old-style C. I have also compiled the program for Windows using Microsoft Visual C/C++ 1.0. Most of the people presently using the program are using the Windows version.
The GEDCOM parser in the program is built around the GEDCOM 5.3 standard. Whereas version 1 of this program checked the GEDCOM input fairly stringently for conformance to the standard, the current version attempts to make sense out of anything that looks remotely like a GEDCOM file. It will complain about grossly malformed GEDCOM files, but it still tries to get through to the end and produce whatever output it can.
The output processor is template-driven. That is, it consists of an interpreter for a simple macro language, which produces output files by processing template strings and filling in information from the GEDCOM database. The template-driven output scheme was used to obtain flexibility and language independence. The default templates use the cross-reference ID's in the GEDCOM file to name the HTML files, and will insert one ``image'' file (if it exists) near the beginning of each individual file and one ``additional information'' file (if it exists) at the end of each individual file. For example, an individual with cross-reference ID ``I101'' would receive an HTML file ``I101.html''. As this file is created, the file ``I101.img'' (intended to be used to insert an image of the person) would be inserted near the beginning, and the file ``I101.inc'' would be inserted at the end (intended to be used to insert arbitrary additional material). Default templates are compiled into the program, and they will be used unless you specify an alternative template using the appropriate command-line argument.
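The naming convention used by the default templates can be illustrated with a minimal sketch. The structure and the function indiv_file_names() below are hypothetical names introduced only for this illustration; they are not taken from the program's source, and snprintf() assumes a C99 library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Names of the files associated with one individual under the
   default templates, as described in the paragraph above. */
struct indiv_files {
    char html[64];  /* output page, e.g. "I101.html" */
    char img[64];   /* optional image fragment, e.g. "I101.img" */
    char inc[64];   /* optional additional information, e.g. "I101.inc" */
};

/* Hypothetical helper: fill in the three names for a cross-reference ID.
   Returns 0 on success, -1 if the ID is too long for the fixed buffers. */
int indiv_file_names(struct indiv_files *out, const char *xref)
{
    assert(out != NULL && xref != NULL);
    /* ".html" is the longest suffix; sizeof counts its terminating NUL. */
    if (strlen(xref) + sizeof ".html" > sizeof out->html)
        return -1;
    snprintf(out->html, sizeof out->html, "%s.html", xref);
    snprintf(out->img, sizeof out->img, "%s.img", xref);
    snprintf(out->inc, sizeof out->inc, "%s.inc", xref);
    return 0;
}
```

Calling indiv_file_names() with the ID ``I101'' fills in ``I101.html'', ``I101.img'' and ``I101.inc''. Whether the ``.img'' and ``.inc'' files actually exist would still have to be checked when the page is written.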
Though versions 2.3a and earlier of this program were released as freeware, since Version 2.3a I have started to spend quite a bit of time responding to E-mail from users of the program. From the E-mail correspondence, it became clear that much better documentation was required. Also, the program itself has grown, and revisions are starting to take more time. In addition, I have begun running an Experimental GenWeb Index site on the World Wide Web to try to establish a central index of as much of the data prepared using the GED2HTML program (and other compatible software) as I can. To justify the amount of time I am spending on this work, I decided to make the current and future versions of the program shareware.
Thanks go to Birger Wathne for contributing useful ideas and code for the first versions of this program, and to a number of other users (including, but not limited to, Annelise Anderson, Allyn Brosz, Susie and Kerry Jane Dunavant, Bob Fieg, W. Wesley Groleau, Brian Mavrogeorge, Steve Messinger, Mike Schwitzgebel, and Doug Smith) of various versions of the program who took the trouble to send me their bug reports and problematic GEDCOM's, as well as to everyone else who sent kind words that make all the work I did on this program worthwhile.
You will always be able to find your way to the current version of the program by starting here. The links below, though current at the time this was written, might become outdated due to network reconfiguration underway at our site.
If you are using Windows, you may also need or want the following utilities.
After downloading, you should have a file ``ged2html-vvvv.tar.gz'', where the ``vvvv'' represents the version number of the distribution. First, uncompress it using the command:
At this point, you should have a number of files, including ``Makefile'', and C sources and header files. Compile the program by typing:
Ged2html is invoked, as are all Un*x commands, by typing its name, possibly followed by command-line arguments. Ged2html can receive its GEDCOM input in one of two ways. If a series of filename arguments are supplied on the command line, then it reads GEDCOM from them, one after the other. If it is invoked without filename arguments on the command line, then it reads GEDCOM from the standard input.
To make more complicated changes to the output format, you will need to supply ``command-line options'' to the program when it is run. This is done in the usual Un*x fashion. The available options are described here. The HTML output of the program can also be customized through the use of output templates, which are described here.
Q: When I link the program, I get a message ``Undefined symbol _fgetln''. What is wrong? [Answer]
Q: When I link the program, I get a message ``Undefined symbol _strdup''. What is wrong? [Answer]
Q: When compiling tags.c, the compiler complains about incompatible types at the line containing ``bsearch()''. What should I do? [Answer]
Q: When compiling main.c, the compiler complains about SEEK_END being undefined. What should I do? [Answer]
A: You are trying to compile the program with a pre-ANSI C compiler. It cannot understand ANSI function prototypes. Get an ANSI C compiler, such as gcc, and use that.
Q: When I link the program, I get a message ``Undefined symbol _fgetln''. What is wrong?
A: You are running on an older or non-BSD version of Un*x that does not have the ``fgetln()'' function. I actually wrote my own implementation of fgetln() and put it in the code, but it is #ifdef'ed out because I would rather use a library version if it exists. Edit the Makefile and delete the ``-DHAVE_FGETLN'' from the ``CFLAGS'' line. Then re-run ``make''.
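The conditional-compilation arrangement described in this answer can be sketched roughly as follows. This is an illustration only, not the code in the distribution: the name read_line() and the growing static buffer are assumptions, chosen so that the fallback mimics the BSD fgetln() behavior of returning a line that is not NUL-terminated together with its length.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef HAVE_FGETLN
/* The C library provides fgetln(); simply forward to it. */
#define read_line(fp, lenp) fgetln((fp), (lenp))
#else
/* Fallback used when -DHAVE_FGETLN is removed from CFLAGS.
   Returns a pointer to the next line (NOT NUL-terminated, trailing
   newline included if present) and stores its length in *lenp.
   Returns NULL at end of file or if memory runs out.  The buffer is
   reused, so the result is only valid until the next call. */
char *read_line(FILE *fp, size_t *lenp)
{
    static char *buf = NULL;
    static size_t cap = 0;
    size_t len = 0;
    int c;

    assert(fp != NULL && lenp != NULL);
    while ((c = getc(fp)) != EOF) {
        if (len == cap) {
            /* Grow the buffer geometrically. */
            size_t ncap = cap ? cap * 2 : 128;
            char *nbuf = realloc(buf, ncap);
            if (nbuf == NULL)
                return NULL;
            buf = nbuf;
            cap = ncap;
        }
        buf[len++] = (char)c;
        if (c == '\n')
            break;
    }
    *lenp = len;
    return len > 0 ? buf : NULL;
}
#endif
```

Keeping both variants behind one name means the rest of the code does not need to know which one was compiled in.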
Q: When I link the program, I get a message ``Undefined symbol _strdup''. What is wrong?
A: You are running on a version of Un*x that does not have the ``strdup()'' function. There is code for it, but it is #ifdef'ed out because I would rather use a library version if it exists. Edit the Makefile and delete the ``-DHAVE_STRDUP'' from the ``CFLAGS'' line. Then re-run ``make''.
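A replacement for strdup() of the kind mentioned here is only a few lines long. The sketch below is a generic version under the same HAVE_STRDUP convention; the name copy_string() is hypothetical and the code is not copied from the distribution.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_STRDUP
/* The C library provides strdup(); use it. */
#define copy_string(s) strdup(s)
#else
/* Fallback: return a freshly malloc'ed copy of s, or NULL if
   memory is exhausted.  The caller is responsible for free(). */
char *copy_string(const char *s)
{
    size_t n;
    char *p;

    assert(s != NULL);
    n = strlen(s) + 1;      /* include the terminating NUL */
    p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}
#endif
```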
Q: When compiling tags.c, the compiler complains about incompatible types at the line containing ``bsearch()''. What should I do?
A: It may be necessary to place the cast (struct tag *) in front of the call to bsearch() in tags.c, or possibly to change the cast on the function argument to bsearch to make the compiler happy. This is not necessary with gcc, and it is my belief that it should not be necessary under ANSI C.
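To show where such a cast would go, here is a small self-contained sketch of a sorted tag table searched with bsearch(). The table contents, the fields of struct tag, and the names compare_tag() and find_tag() are stand-ins invented for this example; the real definitions in tags.c may look quite different.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the tag table in tags.c. */
struct tag {
    const char *name;
    int value;
};

/* bsearch() requires the table to be sorted by the search key. */
static const struct tag tag_table[] = {
    { "BIRT", 1 }, { "DEAT", 2 }, { "FAMC", 3 }, { "NAME", 4 }, { "NOTE", 5 }
};

/* Comparison function with exactly the prototype bsearch() expects. */
static int compare_tag(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct tag *)elem)->name);
}

/* Look up a tag by name; returns NULL if it is not in the table. */
const struct tag *find_tag(const char *name)
{
    assert(name != NULL);
    /* The explicit cast on the result is redundant in ANSI C, where
       void * converts implicitly, but it satisfies stricter compilers. */
    return (const struct tag *)bsearch(name, tag_table,
                                       sizeof tag_table / sizeof tag_table[0],
                                       sizeof tag_table[0], compare_tag);
}
```

Giving the comparison function the const void * prototype and casting inside it is usually the least troublesome arrangement, since no cast is then needed on the function argument itself.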
Q: When compiling main.c, the compiler complains about SEEK_END being undefined. What should I do?
A: Some people have found that they need to #include <unistd.h> in main.c. ANSI C specifies that the constant SEEK_END is defined in <stdio.h>, but on some older systems it is available only through <unistd.h>. If you are having this problem, try #include'ing <unistd.h> too.
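For reference, standard C defines SEEK_END in <stdio.h>, alongside fseek() and ftell(). The fragment below is a generic illustration of a typical use; stream_size() is a hypothetical function and is not meant to reproduce what main.c actually does with the constant.

```c
#include <assert.h>
#include <stdio.h>   /* standard C: fseek(), ftell(), SEEK_END */
/* On an older system where the line above is not enough, adding
   #include <unistd.h> here is the workaround suggested in the answer. */

/* Return the size in bytes of an open, seekable binary stream,
   or -1 on error.  The stream is rewound to the beginning. */
long stream_size(FILE *fp)
{
    long size;

    assert(fp != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);
    rewind(fp);
    return size;
}
```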
After downloading, you should have a file GED2HTML.ZIP. Make a subdirectory GED2HTML (call it whatever you want) in which to unpack the program. Go to that directory and execute:
To make more complicated changes to the output format, you will need to supply optional configuration parameters to the program when it is run. Most of these parameters are accessible from GED2HTML.EXE, the Windows user interface. However, for the few parameters that are not accessible through the user interface, you will need to run G2H.EXE directly, and supply it with ``command-line options.'' This is done as follows:
Q: Everything looks fine on my PC, but when I upload the files to my Web service provider, all the HTML files have suffix ``.htm'' and Netscape doesn't seem to interpret them as HTML files. [Answer]
Q: How can I create all the output files so that their names end in ``.html'', rather than ``.htm''? [Answer]
Q: How can I cause all the HTML files to be generated so that the links in all the individual files specify filenames ending in ``.html'', rather than ``.htm''? [Answer]
Q: After I upload all the HTML files to my Web service provider, none of the links work, because the file names are all in lower case, but the links specify the file names in upper case. [Answer]
A: The current version of GED2HTML.EXE does not have a true Windows front-end, and needs to be launched from the File Manager. See here for the simplest way to do it.
Q: Everything looks fine on my PC, but when I upload the files to my Web service provider, all the HTML files have suffix ``.htm'' and Netscape doesn't seem to interpret them as HTML files.
A: This is a somewhat technical problem. The easiest solution requires the cooperation of your Web service provider. Web servers generally have a ``mime.types'' configuration file, which lists mappings from filename suffixes to MIME types. This information allows the server to determine what kind of information is in a file by looking at the suffix. The server communicates this information to your browser when the file is retrieved, and the browser, in turn, uses the information to control how the file is displayed. Some servers lack an entry:
Q: How can I create all the output files so that their names end in ``.html'', rather than ``.htm''?
A: You can't, at least not on pre-Windows 95 systems. The reason is that DOS only supports three-letter filename suffixes, and will truncate anything longer than that. It might be possible to have longer suffixes on Windows 95. I don't know because I don't have Windows 95. Try using the ``-f'' option to GED2HTML.EXE to specify an alternate filename template string. See here for more details on how to supply options.
Q: How can I cause all the HTML files to be generated so that the links in all the individual files specify filenames ending in ``.html'', rather than ``.htm''?
A: To do this, you need to use the ``-t'', ``-T'', and ``-S'' options to GED2HTML.EXE to cause it to use the sample output template files INDIV.TPL, INDEX.TPL, and SURNAME.TPL, supplied with the distribution, rather than the compiled-in output templates. See here for more details on how to supply options.
Q: After I upload all the HTML files to my Web service provider, none of the links work, because the file names are all in lower case, but the links specify the file names in upper case.
A: You need to use a different program, one that uploads your HTML files without changing their names to lower case. See here for a suggestion.
This file is the top-level file of the hierarchical index of individuals. It can be used as an entry point to the database. If you specified a ``flat'' index, then entries for all the individuals appear in this file in alphabetical order. Otherwise, this file contains links to sub-indexes containing ranges of names, which point ultimately to the individual data pages.
This file contains a list of all the surnames in the database, with links into the lowest-level index pages. This file can also be used as an entry point to the database.
This file contains all the source information from the database. Source links from the individual data pages point here.
This file contains a list of all the individuals in the database, together with birth and death dates and places, but no lineage or other information. It is output in a special format to enable it to be processed by automatic indexing programs. See here for more information on an experimental indexing service I am running.
These files, which are present if you elected to use a hierarchical index of individuals, contain index nodes below the top level.
These files contain the actual data on individuals. The number of individuals per file depends on the default settings and the command-line options you chose.
If you elected to use subdirectories, then these directories are created to hold the individual data files.
If you run G2H.EXE directly from a command line (see here for details), you can supply command-line options exactly as you would under Un*x. (Note that the program does not recognize the DOS conventions of control flags starting with ``/'' and parameters specified using ``:'' after the control flag.) If you launch the program from within Windows, however, you don't actually have a ``command line'' on which to specify the parameters, as it is intended that Windows programs should pop up a dialog box to obtain their parameters. To make running the program easier and less confusing for most Windows users, I have supplied the front-end program GED2HTML.EXE whose purpose is simply to pop up a dialog box, collect parameters, and start G2H.EXE with the appropriate command line. See here for a more detailed description of how to launch the program under Windows.
Here are some examples of typical command lines and their effect on GED2HTML:
ged2html myged.ged
Run GED2HTML on the GEDCOM file ``myged.ged'', using the default options and creating the HTML files containing individual information in the current directory (or in subdirectories of the current directory). Also created are index files ``PERSONS.html'' and ``SURNAMES.html'' (``PERSONS.HTM'' and ``SURNAMES.HTM'' under Windows). If the number of individuals exceeds the default ``index width'' (currently 28), then ``PERSONS.html'' is the root of a hierarchical index organized into several files in such a way that the total number of entries in each file is less than or equal to the index width. Finally, a textual index file ``GENDEX.txt'' (``GENDEX.TXT'' under Windows) is created. This file contains one line for each individual and is output in a format suitable for processing by Unix tools such as AWK and SORT. It is intended for the use of network indexing software that automatically collects pointers to individual data from a large number of sites and merges them into a single master index. See here for further information on an index I am maintaining.
ged2html -i myged.ged
Same as above, except that only the individual HTML pages are created, and not the ``PERSONS.html'', ``SURNAMES.html'' or ``GENDEX.txt'' files.
ged2html -c myged.ged
Automatic capitalization of surnames is disabled, so that surnames appear in the HTML output the same as they do in the original GEDCOM file.
ged2html -w 0 myged.ged
The ``index width'' is set to zero, which disables the production of a hierarchical index and arranges for all individual entries to be placed in the single ``PERSONS.html'' file.
ged2html -w 100 myged.ged
The ``index width'' is set to 100. This results in somewhat fewer index files and a shallower index hierarchy than the default setting.
ged2html -i -c -s I1001 I1002 -- myged.ged
This command specifies that no index files are to be generated, that automatic capitalization of surnames is disabled, and that HTML output files should be produced *only* for the individuals with cross reference ID's (XREF) I1001 and I1002 in the GEDCOM file. An arbitrarily long list of ID's can be specified after the ``-s'' flag, so that the ``--'' option is needed to indicate the end of option processing and that all remaining arguments are to be regarded as the names of GEDCOM files.
ged2html -d 100 myged.ged
Cause the individual HTML files to be placed in subdirectories, with a maximum of 100 files per subdirectory. The hyperlinks (URL's) placed in the HTML output files take the directory organization into account. This means that once you process a GEDCOM file and create a tree of directories and HTML files, you must preserve the organization of this tree if you want a WWW browser to be able to traverse the links between the files.
ged2html -d 0 myged.ged
Specifying ``-d 0'' disables the use of subdirectories, so that all individual HTML files are placed in the current directory.
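The effect of the ``index width'' and files-per-directory settings on the number of output files can be estimated with some simple arithmetic. The sketch below assumes a model in which every index file is filled up to the width before another is started; this is an assumption about the program's behavior, not something taken from its source, and the function names are hypothetical.

```c
#include <assert.h>

/* Number of groups needed to hold n items, at most per_group in each. */
long ceil_div(long n, long per_group)
{
    assert(n >= 0 && per_group > 0);
    return (n + per_group - 1) / per_group;
}

/* Assumed model: total number of index files when each index file
   holds at most `width` entries.  A width of 0 means a flat index. */
long index_file_count(long individuals, long width)
{
    long files = 0;
    long nodes = individuals;

    assert(individuals >= 0);
    assert(width == 0 || width >= 2);
    if (width == 0 || individuals <= width)
        return 1;               /* everything fits in PERSONS.html */
    do {
        /* Each pass adds one level of index files above the previous one. */
        nodes = ceil_div(nodes, width);
        files += nodes;
    } while (nodes > 1);
    return files;
}
```

Applied to the ``t-roux.ged'' run reported earlier in this document, ceil_div(15472, 10) gives the 1548 individual files, and index_file_count(15472, 28) gives 553 + 20 + 1 = 574 index files in three levels, which agrees with the reported figures if the default width of 28 was used. Similarly, ceil_div(1548, 50) gives 31 directories, although the value 50 is only a guess at the setting used for that run.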
Hopefully, you get the idea... Here is a list of all the options currently understood by the program:
Print a brief message listing the available options.
Print version number of program and copyright information.
The GEDCOM 5.3 draft standard specifies that continuation lines created using CONT are to be separated from the previous line by a newline, and that continuation lines created using CONC are not to have a newline. This flag forces a strict adherence to the standard. Strict adherence is not the default because I have found that most genealogy programs don't even output CONC; instead they use separate NOTE records when a line break is indicated. Thus, most people's output looks best when CONT and CONC are treated identically.
Disable automatic capitalization of surnames.
Specify number of individual files per subdirectory (0 means don't use subdirectories).
Specify a template string for the names of the HTML files (default '%s.html' or '%s.htm').
Force production of the ``GENDEX.txt'' textual index file (for use by automatic indexers). See here for more information on what this file is for.
Do not generate the ``PERSONS.html'' and ``SURNAMES.html'' index files containing entries for all the individuals and surnames in the input. See here for more information on these files.
Output files contain specified number of individuals (0 means don't put multiple individuals per file).
Include pedigree charts of the specified depth (0 means don't include any pedigree charts).
Limit the production of output files to a specified list of zero or more selected individuals.
Specify a template file for the surname index.
Specify a template file for individuals.
Specify a template file for the individual index.
See here for more information on templates.
Create hierarchical index with nodes of specified maximum width (0 means put all individuals in one index file).
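The CONT/CONC rule mentioned in the description of the strict-adherence flag above can be made concrete with a short sketch. It implements only the rule as stated for the standard (CONT starts a new line, CONC joins directly); the function append_continuation() is a hypothetical name, and the sketch does not attempt to show what the program does when strict adherence is off.

```c
#include <assert.h>
#include <string.h>

/* Append the value of a CONT or CONC line to the text collected so far.
   text  - NUL-terminated buffer of capacity `size`
   tag   - "CONT" (newline before the value) or "CONC" (no newline)
   Returns 0 on success, -1 for an unknown tag or insufficient space. */
int append_continuation(char *text, size_t size,
                        const char *tag, const char *value)
{
    size_t used, need;
    int is_cont;

    assert(text != NULL && tag != NULL && value != NULL);
    is_cont = (strcmp(tag, "CONT") == 0);
    if (!is_cont && strcmp(tag, "CONC") != 0)
        return -1;

    used = strlen(text);
    need = used + (is_cont ? 1 : 0) + strlen(value) + 1;
    if (need > size)
        return -1;

    if (is_cont)
        text[used++] = '\n';    /* CONT: begin a new line */
    strcpy(text + used, value); /* CONC: join directly */
    return 0;
}
```

Starting from the text ``Born in '', a CONC line with value ``Norway'' yields ``Born in Norway'', and a following CONT line with value ``Died 1900'' places that value on a new line.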
The second place where HTML code is inserted verbatim is at the end of the information for each individual. At this point, the program looks for a file ``xxxxx.inc''. It is intended (though again not required), that this file would include any additional notes or information about this person not present in the original GEDCOM file. For example, I have used this in my hypertext family history document to link to things like wills and divorce records.
If you write your own output templates, you can arrange for inlining anything you want, using the same mechanism as is used by the default templates. For more information on output templates, see here.
If you want to create your own templates, feel free to try it, but be forewarned that I have not made much of an attempt to make template programming ``idiot proof.'' Even I only modify the templates with a debugger close at hand. It is my expectation that most people will use the preloaded templates, and that if you want to make changes to the templates, you probably ought to be a programmer and you probably ought to read the output interpreter code in the file ``output.c'' (in the Un*x distribution) to see exactly what the interpreter can and cannot do. Otherwise, you can just play around making changes to the template files, using the code in the samples as a guide, but I make no guarantee that the program won't crash or otherwise behave strangely if you load in a malformed template.
If you are really convinced that you want to make your own templates, first have a look at the files ``indiv.tpl'', ``index.tpl'', and ``surname.tpl''. These use most, if not all, the available constructs in the output language. A template file consists of text interspersed with variable references and control commands. Variable references start with ``$'', and are used to insert in-line information from the GEDCOM database. Constructs that can appear in variables are as follows:
denotes the ``current object'' (individual, index, or source).
is a subscripting operation that selects the i-th family, event, note, etc. in a list. The identifier i is an ``index variable,'' which takes on values 1, 2, 3, etc.
is a selection operation that follows associations in the database. For example ${@.FATHER} denotes the individual record corresponding to the father of the current individual. You have to look at the sample template files and the code in output.c to see what selectors are understood.
is a selection operation that turns an individual record or index node into a URL to be output in an HTML anchor.
refers to the index variable i.
appearing in a variable name act as delimiters. They must be properly matched.
Control constructs are signalled by a "!" appearing at the beginning of a template line. Conditional execution uses the following control constructs:
The construction:
External HTML code can be inserted inline using the following construct:
The construct:
The construct:
I have organized the program so that it is language-independent, except for the tables in ``tags.c''. All strings in the output come either from the templates or from those tables. If you want to make the program produce output in another language, have a look at ``tags.h'' and ``tags.c'' to see what to do. You should also change the compilation flags in ``Makefile''. At the moment, only English is supported. If you create tables for another language, I'd appreciate receiving them so that I can integrate them back into the source. Thanks!
Please note that I have not made any specific attempt to support ``locales'' or international character sets. Except for converting surnames to upper-case (which can be disabled with a command-line option, see here), the program generally does not mutate input characters supplied to it, and I have at least seen German characters with umlauts, etc. passed properly by the program. I have also had feedback from someone who used an older version of the program to process Hebrew. It apparently worked for the most part, though there were a few problems that may have been caused by the program trying to capitalize surnames.