This weekend and the surrounding weekdays are PyCon! I've been neglecting a lot of the open source work & maintenance that I used to be oh so familiar with, so I decided to take this time to cross some easy tickets off my list.
When I type in man urxvt, mentions of urxvtperl(3) within it are only
half-bolded in my default pager. This caught my attention as a low-hanging
contribution to maybe tidy up their docs a slight amount.
urxvt is kind of an obscure terminal emulator; they do things their own way &
I absolutely love them for that.
Their repository is hosted under cvs.
Specifically, as outlined on http://software.schmorp.de/pkg/rxvt-unicode.html, you can "clone" the repository using:
cvs -z3 -d :pserver:anonymous@cvs.schmorp.de/schmorpforge co rxvt-unicode
I don't know how to use cvs so I opted to not do this :D
Instead, I tinkered around with git a bit to figure out the git cvsimport command:
git cvsimport -C urxvt -r cvs -k -v -d :pserver:anonymous@cvs.schmorp.de/schmorpforge rxvt-unicode
Discovering this took longer than I had hoped; thanks to this stackoverflow post for spelling out the invocation for me: https://stackoverflow.com/a/11490134.
Also, running that command took roughly two hours so that was fun.
The offending manual file is located in doc/rxvt.1.man.in. The .in suffix
hints that there's some preprocessing going on before I see the final output,
but that's extraneous for our current goals.
Man pages are written in the troff typesetting language, which looks pretty
esoteric. But finding what change to make to reach my goal was pretty easy:
- @@RXVT_NAME@@\fBperl\fR\|(3)
+ \fB@@RXVT_NAME@@perl\fR\|(3)
Just have to move the special \fB font escape so that the bold covers the
entire name, including the prefix.
man can be run on the file as-is, with weird pre-preprocessed artifacts:
man doc/rxvt.1.man.in
But to build the files, first configure the entire project with a ./configure
in the base directory, then run make all from within the doc/ directory.
At some point I realized that there is a sibling doc/rxvt.1.pod. A bit more
digging (from within the Makefile) led me to find that the doc/*.man.in
files are generated from the *.pod files:
%.tbl: %.pod
	$(srcdir)/podtbl <$< >$@
%.1.man.in: %.1.tbl
	$(POD2MAN) -s1 <$< >$@
%.3.man.in: %.3.tbl
	$(POD2MAN) -s3 <$< >$@
%.7.man.in: %.7.tbl
	$(POD2MAN) -s7 <$< >$@
I don't know too much about pod/tbl, but upon initial search these look to be
perl-isms, pod standing for "Plain Old Documentation".
The file is also much cleaner, with the generated @@RXVT_NAME@@\fBperl\fR\|(3)
stemming from this:
@@RXVT_NAME@@-extensions(1)
But the resulting question is: where does the formatting come from?
As part of this contribution, I wanted to make sure that I was following prior work with highlighting/distinguishing man page references from within this man page.
Using the search (\d)
from within vim, here's a subset of what I found:
I<xterm>(1)
@@RXVT_NAME@@(7)
I<termcap(5)>
@@RXVT_NAME@@perl(3)
write(1)
B<xev>(1)
L<@@RXVT_NAME@@perl>(3)
In conclusion, I found no rhyme or reason and got more excited to maybe contribute some semblance of order to this obscure file.
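The survey above can be roughly automated. Here is a minimal Python sketch (not something from the urxvt tree) that tallies the reference styles in a chunk of pod text. The style labels and the `classify` helper are names made up for this illustration, and the regex only knows about the single-digit `(n)` sections seen in the list above.

```python
import re
from collections import Counter

# One pattern, two alternatives:
#   1. a B<...>, I<...> or L<...> formatting code, optionally followed by "(n)"
#   2. a bare "name(n)" with no formatting code around it
REF = re.compile(
    r"(?P<tag>[BIL])<(?P<inner>[^<>]+)>(?P<outer>\(\d\))?"
    r"|(?P<bare>[\w@.:-]+\(\d\))"
)


def classify(text):
    """Count man page reference styles found in pod source text."""
    counts = Counter()
    for m in REF.finditer(text):
        if m.group("bare"):
            counts["name(n)"] += 1
        elif m.group("outer"):
            # section number sits outside the formatting code
            counts[f"{m.group('tag')}<name>(n)"] += 1
        elif re.search(r"\(\d\)$", m.group("inner")):
            # section number sits inside the formatting code
            counts[f"{m.group('tag')}<name(n)>"] += 1
        # anything else is ordinary bold/italic/link text, not a man reference
    return counts


sample = "\n".join([
    "I<xterm>(1)",
    "@@RXVT_NAME@@(7)",
    "I<termcap(5)>",
    "@@RXVT_NAME@@perl(3)",
    "write(1)",
    "B<xev>(1)",
    "L<@@RXVT_NAME@@perl>(3)",
])

for style, n in sorted(classify(sample).items()):
    print(f"{n:3d}  {style}")
```

Pointing it at the real file would just be `classify(open("doc/rxvt.1.pod").read())`, assuming the path from earlier in this post.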
pod2{man,html,xhtml,tbl(?)}
rxvt-unicode has a very well formatted manpage located here: http://pod.tst.eu/http://cvs.schmorp.de/rxvt-unicode/doc/rxvt.1.pod
Additionally, the doc/Makefile
has a %.html: %.tbl
rule, so we can build html files!
This is quite a boon: when investigating the man page cross-reference formatting issue, seeing how different output formats render the same source can provide hints!
Unfortunately, urxvt's pod to html converter uses pod2xhtml, which doesn't
exist on my machine, nor is it packaged on default gentoo.
A quick replacement of s/pod2xhtml/pod2html gave me a working setup
though, and I continued down my path!
Unfortunately, pod2html doesn't seem to auto-format the man links at all!
This signals to me that there's an under-specification of what these tokens
"are", and that the pod machinery could use some more hinting.
L<>
I don't know how I found it, but I stumbled onto this addendum on a stackoverflow answer:
Btw, UNIX man pages work right out of docs:
L<crontab(5)>
This brings up http://man.he.net/man5/crontab
This is!! Exactly what I want! A properly documented way to link to man pages without implicit rules trying to auto-detect things!
I hastily surrounded some of the links I was working with, resulting in:
- @@RXVT_NAME@@perl(3)
+ L<@@RXVT_NAME@@perl(3)>
and after running my makefile amalgamation make clean alldocclean alldoc rxvt.1.html all
(with modified s/pod2xhtml/pod2html),
I got exactly what I was looking for, a properly formatted manpage reference
with linking included!
Kind of.
man_url_prefix
The resulting autogenerated man page reference URL directs to: http://man.he.net/man3/urxvtperl.
Which 404s.
I'm not exactly sure what the bar to get a man page up on http://man.he.net
is, but apparently urxvt doesn't make it.
There are plenty of other online man page providers that do include it, though.
There's no shortage of options. An alternative approach could be to figure out how to get man.he.net to index urxvt's man pages too, but that requires dealing with people & bureaucracy, and I go down these rabbitholes to deal with software.
Digging into pod2html, it is somewhat configurable and lets you change the
website that man page references link to. This is done through the
man_url_prefix "variable". There doesn't seem to be any way to modify this
variable from the command line, so this distraction has kind of led to a dead end.
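To make the role of the prefix concrete, here is a small Python sketch of how a link like the ones above could be assembled. It is modeled only on the two URLs seen in this post (http://man.he.net/man5/crontab and http://man.he.net/man3/urxvtperl), not on pod2html's actual source. The `man_url` function, the `DEFAULT_PREFIX` constant, the fallback to section 1, and the example.org prefix are all assumptions for illustration.

```python
import re

# Inferred from the generated URLs above; treat it as an assumption.
DEFAULT_PREFIX = "http://man.he.net/man"


def man_url(ref, prefix=DEFAULT_PREFIX):
    """Turn a reference such as 'crontab(5)' into a URL: prefix + section + '/' + page."""
    m = re.fullmatch(r"([^(]+)(?:\((\d+)\))?", ref)
    if m is None:
        return None
    page = m.group(1)
    section = m.group(2) or "1"  # assumed fallback when no section is given
    return f"{prefix}{section}/{page}"


print(man_url("crontab(5)"))
print(man_url("urxvtperl(3)"))
# A different provider would only need a different (here: placeholder) prefix.
print(man_url("urxvtperl(3)", prefix="https://example.org/man"))
```

The point is just that everything site-specific lives in the prefix, which is why being unable to set it from the command line is so annoying.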
I've been making heavy use of grep.app recently, and plugging
man_url_prefix into there results in 9 (nine) total uses of it
throughout all of github. I'm not sure this variable setting has actually been
used in any real capacity.
pod2xhtml
I quickly dug myself out of pod2html hackery, since none of it actually
moves me towards my goal: urxvt's doc builder doesn't use
pod2html, it uses pod2xhtml!
pod2xhtml
is much worse off.
It doesn't exist in gentoo's repo tree because it hasn't been touched in over a
decade (last update: 2010).
It uses a legacy link parser that
unfortunately dashes my hopes of improving the html generated manual along with
the man page -- it doesn't autolink to online man page references.
By default the L<> wrapped man links just turn into this html:
<cite>urxvtperl</cite>(3)
Which, honestly, isn't the worst. This mimics some manually crafted
I<xterm>(3)s found within the page, so I consider it an acceptable
modification.
A consideration to be made for the future though: pod2html works and
generates a pretty identical looking html file. Perhaps it's worth porting over
at some point? http://pod.tst.eu seems to be running a cgi script providing
realtime pod2xhtml; not sure who owns this but that's how urxvt's
documentation is being rendered currently.
With a bit of work, this could definitely transition over to a pod2html / static
served html man page setup.
podlators
The current iteration of pod2man lives within podlators.
The crux of the issue that started this all comes from this block of regex:
# Change references to manual pages to put the page name in bold but
# the number in the regular font, with a thin space between the name and
# the number. Only recognize func(n) where func starts with an alphabetic
# character or underscore and contains only word characters, periods (for
# configuration file man pages), or colons, and n is a single digit,
# optionally followed by some number of lowercase letters. Note that this
# does not recognize man page references like perl(l) or socket(3SOCKET).
if ($$self{GUESSWORK}{manref}) {
    s{
        \b
        (?<! \\ )                           # rule out \e0(1)
        ( [A-Za-z_] (?:[.:\w] | \\-)+ )
        ( \( \d [a-z]* \) )
    } {
        '\f(BS' . $1 . '\f(BE\|' . $2
    }egx;
}
Specifically, the recognition heuristic only matches a portion of
@@RXVT_NAME@@perl(3), with or without the L<> construct.
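To see the partial match in isolation, here is a rough Python port of the regex above. This is only a sketch: the `bold_manrefs` name is made up, and it writes `\fB` / `\fR` directly where the Perl code emits the intermediate `\f(BS` / `\f(BE` markers, on the assumption that those end up as the bold-on / bold-off escapes seen in the generated man page.

```python
import re

# Port of the manref heuristic quoted above (same character classes).
MANREF = re.compile(
    r"""
    \b
    (?<!\\)                         # rule out \e0(1)
    ([A-Za-z_](?:[.:\w]|\\-)+)      # page name
    (\(\d[a-z]*\))                  # section, e.g. (3) or (3pm)
    """,
    re.VERBOSE,
)


def bold_manrefs(text):
    """Bold the page name, leave the section number in the regular font."""
    return MANREF.sub(
        lambda m: r"\fB" + m.group(1) + r"\fR\|" + m.group(2),
        text,
    )


# The '@' characters are not word characters, so the name the regex
# recognizes can only begin after the last '@'.
print(bold_manrefs("@@RXVT_NAME@@perl(3)"))
# With the placeholder already expanded, the whole name is matched.
print(bold_manrefs("urxvtperl(3)"))
```

The first call reproduces the half-bolded output from the start of this post: only `perl` ends up between the bold escapes, because pod2man runs while the name is still the `@@RXVT_NAME@@` placeholder.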
The resulting patch to podlators circumvents this guesswork when within the
L<> construct and just generally bolds the contents of the link (when not a
URL), special-casing the man reference type to not bold the suffixed section
number.
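As a rough illustration of those rules (and explicitly not the actual podlators patch, which is Perl), here is a Python sketch of the decision being described. The `format_link` name, the URL test, and the choice to keep the `\|` thin space before the section are all assumptions on my part.

```python
import re


def format_link(target):
    """Render the contents of an L<...> code as roff, following the rules above."""
    # URLs are left alone (assumed test: anything that starts with a scheme://).
    if re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", target):
        return target
    # Man reference: bold the whole name, keep the section in the regular font.
    m = re.fullmatch(r"(.+?)(\(\d[a-z]*\))", target)
    if m:
        return r"\fB" + m.group(1) + r"\fR\|" + m.group(2)
    # Any other link text is simply bolded.
    return r"\fB" + target + r"\fR"


print(format_link("@@RXVT_NAME@@perl(3)"))
print(format_link("perlpod"))
print(format_link("https://example.org/docs"))
```

Because the name is taken from the explicit L<> contents instead of being guessed, the `@@RXVT_NAME@@` prefix lands inside the bold span.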
And the resulting patch to rxvt-unicode is trivial.
Through a mailing list, too :D