%% DocBy.\TeX{} -- Making a Documentation Of Source Codes By TeX

\def\projectversion{\dbtversion}
\def\headtitle{DocBy.\TeX}

\showboxbreadth=1500 \showboxdepth=2
\hfuzz=4pt

\input docby.tex

\title DocBy.\TeX{} -- Making a Documentation Of Sources By \TeX{}

\author Petr Ol\v s\'ak

\centerline{\ulink[http://www.olsak.net/docbytex.html]%
                  {www.olsak.net/docbytex.html}}

\def\db{\dg\nb}
\def\du#1{\api{\nb#1}}
\let\quotehook=\langleactive
\def\insdef#1 {\ifirst{docby.tex}{def\nb#1 }{^^B\cbrace}{++}}
\def\inssdef#1 {\ifirst{docby.tex}{def\nb#1}{\empty}{+-}}
\bgroup
   \catcode`\[=1 \catcode`]=2 \catcode`\{=12 \catcode`\}=12
   \gdef\obrace[{] \gdef\cbrace[}]
\egroup

\def\indexhook{The control sequences marked by ($\succ$) are user-level
   sequences. Other control sequences are internal to DocBy.\TeX.
   The bold page number points to the place where the sequence is
   defined and documented; the other page numbers point to occurrences
   of the sequence. User-level control sequences have one underlined
   page number in the list of page numbers. This is the page where
   the sequence is documented at user level.
   \medskip}

\def\nn#1 {\noactive{\nb#1}}
\nn insert \nn undefined

\def\cnvbookmark#1{\lowercase{\lowercase{#1}}}
\def\bookmarkshook{\lo ěe \lo šs \lo čc \lo řr \lo žz \lo ýy
   \lo áa \lo íi \lo ée \lo úu \lo ůu \lo óo \lo ňn }
\def\lo #1#2{\lccode`#1=`#2}

\dotoc \bookmarks

\sec Preface
%%%%%%%%%

DocBy.\TeX{} lets you create documentation of source code with \TeX{}.
The source code can be in the C language or in any other computer
language. In contrast to Knuth's ``literate programming'', this tool
uses no preprocessor to filter the information for the human reader and
the information for the computer out of a single source file. I suppose
that programmers prefer to write and tune the program in a computer
language before they start to write the documentation. It would be fine
to write the documentation afterwards, without modifying the source
code of the working program.
Modern systems make it possible to open several windows with several
text editors: you can see the source code in one editor and write its
documentation in another. There is no longer any need to merge both
kinds of information (for the computer and for the human reader) into
a single file.

The first part of this document (\cite[uziv]) describes \docbytex{}
at user level. The next part documents the default macros implemented
in \docbytex{}, which experienced users may want to change in order to
fulfil special wishes. The next section~\cite[design] documents the
design-related macros. Users can change them to give their documents
a better look. The last section~\cite[implementace] describes all
macros of \docbytex{} at implementation level in detail.

This document was created by \docbytex{} itself, so it can serve as
an example of \docbytex{} usage.

\sec [uziv] For Users
%%%%%%%%%%%%%%%%%%%%%

\subsec [cleneni] File Types
%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\docbytex{} is designed as a tool for documenting programs written in
the C language. That is why the following example documents
a hypothetical program written in this language. If you need to
document another computer language, you can change some macros (see
section~\cite[zmeny]).

We suppose that the source code is separated into ``modules''. Each
module deals with one particular problem solved by the programmer.
Each module has its own name ("foo" for example) and is written in
the files "foo.h" and "foo.c". These files are compiled into "foo.o".
All modules are linked at the end of compilation into the executable
program.

To document these source files, we create a new file with the ".d"
extension for each module, for example "foo.d". The documentation of
the module is written in that file. Next we create the main file (for
example "program.tex") where all "*.d" files are included by the
command "\module"\du{module}.
You can use the commands "\title" (name of the program), "\author"
(name of the author) and, for example, "\dotoc" to generate the table
of contents and "\doindex" to generate the index. Of course, you can
also write introductory or general notes on the program in the main
file. The contents of the file "program.tex" can be:

\begtt
\input docby.tex
\title The Program lup -- Documentation of The Source Codes

\author Progr and Ammer

\dotoc  % the table of contents will be here

\sec The structure of the source files

The source files are in the three modules.
The auxiliary functions are defined in "base.c" and "base.h" files.
The window management is solved in "win.c" and "win.h" files.
The file "main.c" includes the function "main".
\module base
\module win
\module main

\doindex  % the index will be created here
\bye
\endtt

We decided to order the documentation from ``simple'' functions to
the more complicated problems. Somebody may prefer the opposite order,
starting with the "main" function and ending with the auxiliary
functions. He/she can write:

\begtt
\module main
\module win
\module base

\doindex
\bye
\endtt

Both ways are possible because the documentation is hyperlinked
automatically. When the reader sees the usage of some function,
he/she can go to the definition of this function by one click. The
reverse hyperlinks are included too.

\subsec [priklad] An Example of the Module Documentation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Let us document the module "foo" in the file "foo.d". This file is
included by the "\module"~"foo" command. We can describe any part of
the source "foo.c" in words and combine the description with listings
of parts of "foo.c" or "foo.h" inserted by the command
"\ins"~"c <keyword>"\du{ins} or "\ins"~"h <keyword>". The part of the
source code is usually delimited by a "//: <keyword>" line. An example
follows. Suppose that the following text is written in the file
"foo.d":

\begtt
The struct \dg [struct] mypair is used as a return value of
"my_special_function". There are two "float" values.
\ins c mypair

The \dg [struct mypair] my_special_function() has one parameter "p" and returns
double and triple of this parameter in "mypair" struct.
\ins c my_special_function
\endtt

The file "foo.c" has to include the comments "//: "{\tt mypair}
and "//: "{\tt my\_special\_function}. These comments delimit the parts
of the source code to be listed in the documentation:

\begtt
#include 

//: mypair

struct mypair {
  float x, y;
};

//: my_special_function

struct mypair my_special_function (float p)
{
  struct mypair return_pair;
  return_pair.x = 2*p;     // double of p
  return_pair.y = 3*p;     // triple of p
  return return_pair;
}
\endtt

The result looks like this:

\bigskip

The struct \dg [struct] mypair is used as a return value of
"my_special_function". There are two "float" values.

\def\modulename{foo}
\ins c mypair

The \dg [struct mypair] my_special_function() has one parameter "p" and returns
double and triple of this parameter in "mypair" struct.
\ins c my_special_function

The first listed part of the source code starts at "//: "{\tt mypair}
and ends at the first occurrence of "//:". The second listed part
starts at "//: "{\tt my\_special\_function} and ends at the end of the
file. These delimiters (and the neighbouring empty lines) are not
printed.

The order of the listed parts is independent of their order in the
source file. We can first comment on my special function and include
its source code. Afterwards we can explain the structure mypair and
show the source code of this structure.

Notice that the line numbers correspond exactly to the lines in the
source code. It means that the missing line "#include "{\tt} has
number one and the first printed line has number five.

The "//: <keyword>" delimiter and the closing delimiter "//:" can be
placed anywhere on the line, not necessarily at its beginning. The
lines with the delimiters are not printed.

Notice the command "\dg" in the source of the documentation. The
documented word (separated by space) follows immediately.
The optional parameter in brackets is interpreted as the ``type'' of
the documented word. The documented word is printed in red on
a rectangle, and all occurrences of that word in the documentation are
printed in blue and treated as hyperlinks to the place where the word
is documented (red color). An occurrence of the word has to be written
between the quotes {\tt\char`\"...\char`\"} or be placed in the
inserted source code. You need not add any marks to the source code in
order to highlight the usage of the documented word. This is done
automatically.

If the documented word has the brackets "()" at the end, then it is
a function. These brackets are not printed at the current place, but
they are printed in the footnotes and in the index.

The quotes {\tt\char`\"...\char`\"} are delimiters of ``parts of
listings inside a paragraph''. This text is printed in typewriter font
and the occurrences of documented words are hyperlinked here. All
characters are printed here without re-interpretation, which means
that this environment behaves like ``verbatim''.

The footnote includes a list of all documented words on the current
page. Each word is followed by a list of page numbers. These are all
the pages where the documented word occurs.

All documented words are automatically inserted into the alphabetical
index created by the "\doindex" command.

\subsec What Version of \TeX{} for \docbytex{}?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

In order to activate all features mentioned above we need to use
pdf\TeX{} extended by enc\TeX{}. The language of automatically
generated words (such as Contents, Index) is selected by the current
value of the "\language" register when "\input docby.tex" is
processed. \docbytex{} writes the ``modes'' information on the
terminal:

\def\begtthook{\catcode`\!=0 \sfcode`.=1000 }
\begtt
This is DocBy.TeX, version !dbtversion, modes: enc+PDF+ENG
\endtt
\def\begtthook{}

\docbytex{} can work in the following modes: "enc/NOenc", "PDF/DVI",
"ENG/CS".
The "enc"\api{enc} mode is activated if enc\TeX{} is detected.
Otherwise (if enc\TeX{} is unavailable), \docbytex{} prints a warning
and sets the "NOenc"\api{NOenc} mode: the occurrences of documented
words are not detected and hyperlinked. The index is much poorer,
because the pages with occurrences of the words are missing. Only the
places where the words are documented are referenced. It means that
the enc\TeX{} extension is very important for \docbytex. This
extension is usually available in current \TeX{} distributions and it
is activated by the "pdfcsplain" format. So the recommendation is: use
"pdfcsplain" when you are using \docbytex.

The "PDF"\api{PDF} mode is activated if pdf\TeX{} is used. Otherwise
\docbytex{} switches to the "DVI"\api{DVI} mode and prints a warning
message on the terminal. Colors and hyperlinks do not work in "DVI"
mode, but the list of pages with all occurrences of documented words
is printed in the index (if enc\TeX{} is activated).

If "\language=0" or "(pdf)csplain" isn't used, then the language mode
is set to ENG (English words will be generated). Otherwise this mode
is set to CS (Czech words will be generated). If you are using another
language, you need to redefine some macros, see
section~\cite[nazvy].

\subsec Searching Words by Enc\TeX{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Enc\TeX{} locates the hyperlinked words by a ``greedy algorithm''
(the longest match wins). It means that if there are two documented
words "abc" and "abcde" then the text "abcdefg" is divided into the
hyperlinked part "abcde" (the blue color is used) and the normal part
"fg" (black color). The hyperlinked part points to the place where the
word "abcde" is documented. On the other hand the text "abcdx" is
divided into the hyperlinked part "abc", which points to the
documentation of the word "abc", and the normal part "dx".

Enc\TeX{} is not able to work with regular expressions.
It means that there is no simple way to search only for words bounded
by spaces, other white characters or punctuation. Enc\TeX{} finds the
word even as a part of another word. This leads to unexpected
situations: a short word is documented but it is also a part of longer
undocumented words used in the source code. For example, you document
the structure "turn" but you don't want to hyperlink that part of the
word "return". In such a case you can define the "return" word as
a ``normal'' undocumented word by the command
"\noactive{<text>}"\du{noactive} (for example "\noactive{return}").
This command declares the "<text>" as a searched word (for enc\TeX)
but sets it as inactive.

Imagine that you document a word which is used in the code in its
``documented meaning'' only if some text precedes this word and/or
some text follows it. If the word is used with another prefix/postfix
then it has an undocumented meaning. In such a case you can use the
declaration "\onlyactive{<before>}{<word>}{<post>}"\du{onlyactive}.
If you declare the word by "\dg <word>" (or in a similar manner, see
section~\cite[ddsl]), then the word is hyperlinked in the source code
only if the text "<before>" precedes it and the text "<post>" follows
it. The text "<before>" and/or "<post>" itself stays inactive. The
parameters "<before>" or "<post>" can be empty (but not both
simultaneously) and you can use several "\onlyactive" declarations for
a single "<word>".

\docbytex{} activates the enc\TeX{} searching only inside the group
{\tt\char`\"...\char`\"} or in listings of source code. It means that
"\mubytein=1" (see the enc\TeX{} documentation) is set only in these
situations. We recommend leaving "\mubytein=0" outside these
environments. If you set "\mubytein=1" (for example because of UTF-8
encoding) for the whole document, you do so at your own risk. The
words inside your comments can be hyperlinked in such a case.
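
The two declarations described above might be combined as in the
following sketch. It is only an illustration: the documented word
"next" and the prefix "." are hypothetical, and the order of the
"\onlyactive" arguments (preceding text, word, following text) is an
assumption taken from the description above.

\begtt
\noactive{return}        % "turn" is documented, "return" stays plain
\onlyactive{.}{next}{}   % assumed: link "next" only when "." precedes it
\endtt

Under these assumptions, "next" would be hyperlinked in "p.next" but
not in "nextitem", and "return" would never be linked to the
documentation of "turn".
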
\subsec The Index, Table of Contents, Footnotes and Bookmarks Generation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The generation of the index and the table of contents is done entirely
at the macro level of \docbytex{}. You needn't use any external
program (\docbytex{} itself does the alphabetical sorting). Just write
"\doindex"\du{doindex} or "\dotoc"\du{dotoc} at the desired place in
your document. Warning: the table of contents is not generated
correctly after the first pass of \TeX. You have to run \TeX{} twice.
The page numbers may change after the second pass because the table of
contents has been inserted. Thus correct output is (maybe) guaranteed
after the third pass of \TeX. The word ``maybe'' is written here
because of the problem with footnotes mentioned in
section~\cite[specfootnote]. The footnotes are changed in all three
\TeX{} runs and this retrospectively influences the vertical
typesetting. This is why \docbytex{} checks the consistency of the
references generated by the current and the previous \TeX{} pass. This
check is done while the "\bye"\du{bye} macro is being processed. Thus,
it is useful to write the "\bye" command instead of the "\end"
primitive at the end of the document. If the "\bye" macro is used then
you can see the message ``{\tt OK, all references are consistent}'' on
the terminal or the warning ``{\tt page references are inconsistent,
run me again}''.

You can test the consistency in more detail with the following script:

\begtt
#!/bin/bash
cp document.ref document.r0
pdfcsplain document
diff document.r0 document.ref
\endtt

\docbytex{} tries to fix the footnote processing after the second pass
in order to make the document converge. If you make big changes in the
document after that then \docbytex{} does change the numbers of lines
for footnotes and Overfull/Underfull boxes may occur. In such a case
we recommend removing the {\tt.ref} file and running three passes of
\docbytex{} again.
\docbytex{} creates structured bookmarks in the PDF output if the
"\bookmarks"\du{bookmarks} command is used. The structured bookmarks
include the names of parts, sections, subsections and documented
words. It does not matter where the "\bookmarks" command is written,
because the information used in bookmarks is read from the {\tt.ref}
file. The problem of the encoding of bookmark texts is discussed in
section~\cite[hooky].

\subsec [vkladani] Source Code Inserting
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Instead of the simple command "\ins" you can use two more elaborate
commands, "\ifirst"\du{ifirst} and "\inext"\du{inext}, in order to
insert a part of the source code into your documentation.

The "\ifirst{<file>}{<from>}{<to>}{<why>}" command inserts a part of
the file "<file>" (full file name including extension), starting at
the first line containing the pattern "<from>" and ending at the line
containing the pattern "<to>" or (if no such line exists) at the end
of the file. If the pattern "<from>" does not exist, a warning is
printed on the terminal.

The parameters of the "\ifirst" command are first expanded and used
thereafter. The active tie character is expanded to a space.

The parameter "<why>" specifies whether the line with the "<from>"
pattern and/or the line with the "<to>" pattern is to be printed. This
parameter has exactly two characters (plus and/or minus) with the
following meaning:

\begtt
why: --    don't print first nor ending line
why: +-    print first line but don't print ending line
why: -+    don't print first line but print ending line
why: ++    print both lines
\endtt

If the parameter "<from>" is empty (use the "{}" notation) then the
printing starts at the beginning of the file. If the parameter "<to>"
is empty, only one line is printed. If "<to>=\end",\du{end} then
printing stops at the end of the file. No ending line exists in such
a case.

If the parameter "<from>" (or "<to>" respectively) has the
"\empty"\du{empty} value (use the "{\empty}" notation) then the
printing starts (or stops respectively) at the first empty line. You
can specify whether this line is printed by the "<why>" parameter.
The parameters "<from>" and "<to>" can start with the "^^B" character
(meaning that the pattern has to be at the beginning of the line)
and/or end with the "^^E" character (meaning that the pattern has to
be at the end of the line). For example the parameter "^^Btext^^E"
means that "text" has to be alone on the line.

The special \TeX{} characters (special categories) are not allowed in
the "<from>" and "<to>" parameters. You have to use the special
control sequences "\nb"\du{nb}, "\obrace"\du{obrace},
"\cbrace"\du{cbrace}, "\percent"\du{percent} and
"\inchquote"\du{inchquote} instead of the "\", "{", "}", "%",
{\tt\char`\"} characters. You can define additional sequences for
other special \TeX{} characters, for example:

\begtt
{\catcode`\#=12 \gdef\hashmark{#}}
\endtt

If the parameters "<from>" and "<to>" are the same, or the "<to>"
pattern is on the same line as the "<from>" pattern, then only this
line is printed ("<why>" has to be "++" or "+-"). If this condition is
true but "<why>" is "-+" or "--", then the printing of the code stops
at the next line with the "<to>" pattern or at the end of the file.

The "\ifirst" command remembers the name of the included file and the
number of the last line which was read. Next time you can use the
command "\inext{<from>}{<to>}{<why>}". This command starts searching
for the "<from>" pattern at the first line which wasn't read by the
previous "\ifirst" or "\inext" command. The parameters of the "\inext"
command have the same meaning as the parameters of the "\ifirst"
command. The parameter "<file>" is missing because the "<file>" from
the last "\ifirst" command is used.

The number of the last line read by the "\ifirst" or "\inext" command
is stored in the "\lineno"\du{lineno} register (no matter whether this
line was printed or not). If the printing of code stopped at the end
of the file then "\lineno" equals the number of lines of the file. You
can test whether the end of the file was reached by "\ifeof\infile".
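
As a minimal sketch of how these two pieces of information might be
inspected (the file name "file.c" and the keyword "setup" are
hypothetical), the following lines insert one part of the file and
then report on the terminal which line was read last and whether the
end of the file was reached:

\begtt
\ifirst {file.c}{//: setup}{//:}{--}
\message{last line read: \the\lineno}
\ifeof\infile \message{(end of file reached)}\fi
\endtt
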
Examples:

\begtt
\ifirst {file.txt}{foo}{foo}{++}        % print the first line
                                        % with the text "foo"
\inext {foo}{foo}{++}                   % print the next line with
                                        % the occurrence of "foo"
\ifirst {file.c}{//: from}{//:}{--}     % the same as \ins command
\ifirst {file.h}{func(}{)}{++}          % print of function prototype
\ifirst {file.c}{func(}{^^B\cbrace}{++} % print of the code func
\ifirst {file.txt}{}{\end}{++}          % print of the whole file
\ifirst {file.txt}{}{\empty}{+-}        % print of the first block
                                        % separated by empty line
\endtt

If the first line of the code to be printed is empty then it is not
printed. If the last line of the code to be printed is empty, it is
not printed either. This is the default behavior. But if you write
"\skippingfalse",\du{skippingfalse} this behavior is switched off. It
means that empty lines can occur at the beginning or at the end of
listings. You can use "\skippingtrue"\du{skippingtrue} to return to
the default behavior.

The parameters "<from>" and "<to>" can have a prefix of the form
"\count=<num> ".\du{count} The value "<num> - 1" tells how many
occurrences of the pattern have to be skipped and ignored during
searching. Only the "<num>"-th occurrence of the pattern is
significant. For example "{\count=3 foo}" means that two occurrences
of "foo" have to be skipped and the third occurrence points to the
right place, where the printing of the code starts (or ends).

If the prefix "\count=<num> " is missing then \docbytex{} supposes
that "\count=1".

If the parameter "<from>" or "<to>" is empty and "\count=<num>" is
used, then the space after "<num>" needn't be written and the meaning
is slightly different: if the "<from>" parameter is empty then
"\count" means the number of the line where the printing starts. If
the parameter "<to>" is empty then "\count" means the number of
printed lines. The previous sentences are true for "<why>=++" and for
"\skippingfalse". If the "<why>" parameter has a different value
and/or "\skippingtrue" is set, then you must add/subtract one or two
to/from the line number/number of lines.
Examples:

\begtt
\skippingfalse
\ifirst {file.txt}{\count=20}{\count=10}{++}  % print from line 20 to 29
\ifirst {file.txt}{}{\count=2 \empty}{+-}     % print to the second empty line
\ifirst {file.txt}{\count=50}{\end}{++}       % print from 50th line to the end
\ifirst {file.tex}{\count=5 \nb section}{\count=2 \nb section}{+-}
                                              % print fifth section from TeX source
\endtt

\subsec [lineodkazy] References to Line Numbers
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The command "\cite[