I spent this week designing and implementing the CVS integration. Fedora’s CVS is set up so that you can’t download the upstream source packages directly from it. Instead, you first download a “sources” file from CVS and use it to generate the URL from which the .tar.{gz, bz2} files can be downloaded. I documented the whole process at http://fedoraproject.org/wiki/SummerOfCode/2007/VillePekkaVainio/SourceRepository
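To make the “sources” step a bit more concrete, here is a minimal sketch of the parsing and URL generation. It assumes each line of the “sources” file is a checksum followed by a file name, and it takes the URL layout as a template argument instead of hard-coding it. The function names, the template and the example.invalid host are all made up for illustration and are not taken from manimport.py; the real layout is described on the wiki page above.

```python
from typing import List, Tuple


def parse_sources(text: str) -> List[Tuple[str, str]]:
    """Return (checksum, filename) pairs from the text of a "sources" file.

    Assumption: one "checksum  filename" entry per line.
    """
    entries = []
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, filename = parts
        entries.append((checksum, filename.strip()))
    return entries


def tarball_url(template: str, package: str, checksum: str, filename: str) -> str:
    """Fill a URL template; the template itself is supplied by the caller."""
    return template.format(package=package, checksum=checksum, filename=filename)


if __name__ == "__main__":
    # Hypothetical file contents and a placeholder URL template.
    sample = "d41d8cd98f00b204e9800998ecf8427e  yum-3.2.0.tar.gz\n"
    template = "http://example.invalid/{package}/{filename}/{checksum}/{filename}"
    for checksum, filename in parse_sources(sample):
        print(tarball_url(template, "yum", checksum, filename))
```

Keeping the template outside the function means the download location can change without touching the parsing code.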
At this point the manimport.py script takes a package name (like yum) and a branch name (like F-7) from the user. It then downloads the “sources” file from CVS, parses it, downloads the source package, searches it for man files (files that end in .number, where number is 1-8), extracts them into a temporary directory and runs them through doclifter. The result goes into a wiki page under a “section”/“page name” namespace hierarchy.
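The “search for man files and extract them” step could look roughly like the sketch below. It only uses Python's standard tarfile module and the suffix rule described above (.1 to .8); the function name and the choice to flatten everything into one directory are my assumptions here, not necessarily how manimport.py does it.

```python
import os
import re
import shutil
import tarfile
from typing import List

# A man file is any regular file whose name ends in .1 through .8.
MAN_SUFFIX = re.compile(r"\.[1-8]$")


def extract_man_files(tar: tarfile.TarFile, dest_dir: str) -> List[str]:
    """Copy every man file in an open tarball into dest_dir.

    Returns the paths that were written. Directory structure is flattened,
    so two man files with the same base name would overwrite each other.
    """
    written = []
    for member in tar.getmembers():
        if not (member.isfile() and MAN_SUFFIX.search(member.name)):
            continue
        source = tar.extractfile(member)
        if source is None:
            continue
        target = os.path.join(dest_dir, os.path.basename(member.name))
        with source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)
        written.append(target)
    return written
```

Opening the tarball with tarfile.open(path) detects gzip and bzip2 compression automatically. Note that the suffix rule alone also matches things like versioned library names (libfoo.so.1), so a real import probably needs an extra check on the file contents.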
I also made a request for resources to the Fedora Infrastructure team. Hopefully sometime soon I’ll have a public test wiki 🙂
According to my original schedule, I should have the publication phase completed next week. I do think it’s possible. There are basically three things that need to be done: improve the CVS handling even more, implement the sisterdiff functionality and import info pages too. There are some problems, though. I discussed some of these on #fedora-docs yesterday, but maybe I’ll post on fedora-devel-list too. You can also leave comments on this blog. Here are some of the points that I’m currently thinking about:
CVS handling and updates
Right now my code takes one package as a command line argument and processes all man pages from that package. But what about the bigger picture? How should I import all of the man pages in one distribution? Do a “cvs co -c”, download every “F-7” package it shows and look for man pages there? That’s a huge number of packages and it will take a lot of time. On the other hand, the import probably won’t be done that often.
How should I handle package updates and how can they be seen from CVS? Or should some kind of repository tracking be added too?
Functionally, maybe have a command line switch with which (so many words with w’s and i’s and h’s 😉 ) the user can decide whether to check for updates or do a full import.
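As a rough idea of what such a switch could look like, here is a small sketch using Python's argparse module. The --update flag and the argument names are hypothetical; nothing like this exists in manimport.py yet.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command line parser with a hypothetical --update switch."""
    parser = argparse.ArgumentParser(
        prog="manimport.py",
        description="Import man pages from a Fedora package into the wiki.",
    )
    parser.add_argument("package", help="package name, e.g. yum")
    parser.add_argument("branch", help="branch name, e.g. F-7")
    # Hypothetical flag: without it the script does a full import.
    parser.add_argument(
        "--update",
        action="store_true",
        help="only check for updated packages instead of doing a full import",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    mode = "update check" if args.update else "full import"
    print(f"{mode} for {args.package} on {args.branch}")
```

A full import stays the default, so the current way of calling the script would keep working unchanged.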
Info pages
This is another interesting problem. Texinfo pages can be converted to DocBook XML with the makeinfo command, but Texinfo is different from the Info format in which info pages are usually distributed in upstream source packages. And to my knowledge there are no Info -> DocBook XML converters. So should I try to collect the Texinfo “masters” of the info pages, or how should this be handled? Writing an Info -> DocBook converter at this stage would probably take too much time away from other things.
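If I go with the Texinfo “masters”, the conversion itself is simple. The sketch below just wraps makeinfo's --docbook option with Python's subprocess module; the function names and the output naming are my own assumptions, and it obviously needs makeinfo installed to actually run the conversion.

```python
import subprocess
from pathlib import Path
from typing import List


def docbook_command(texi_path: str, out_path: str) -> List[str]:
    """Build the makeinfo command line for a Texinfo -> DocBook conversion."""
    return ["makeinfo", "--docbook", "-o", str(out_path), str(texi_path)]


def texinfo_to_docbook(texi_path: str, out_dir: str) -> Path:
    """Convert one .texi file to DocBook XML and return the output path.

    Raises subprocess.CalledProcessError if makeinfo reports an error.
    """
    out_path = Path(out_dir) / (Path(texi_path).stem + ".xml")
    subprocess.run(docbook_command(texi_path, str(out_path)), check=True)
    return out_path
```

This only covers packages that ship their .texi sources; packages that ship only the generated Info files are still the open question.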
Sisterdiff
Sisterdiff means the ability to compare pages like FC-6/8/yum and F-7/8/yum, which is not possible with a regular Moin installation. This should be pretty straightforward, as someone has already made an Action like this for Moin. I’ll start with that and see what it can do.
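Just to illustrate the idea, here is a sketch in plain Python using the standard difflib module. It does not use any Moin API and is not the existing Action; the function names are made up, and it assumes page names always follow the branch/section/page pattern from the example above.

```python
import difflib


def sister_page_name(page_name: str, other_branch: str) -> str:
    """Map a page like FC-6/8/yum to its sister page on another branch."""
    _branch, rest = page_name.split("/", 1)
    return f"{other_branch}/{rest}"


def sister_diff(old_name: str, old_text: str, new_name: str, new_text: str) -> str:
    """Return a unified diff between the texts of two sister pages."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(lines)


if __name__ == "__main__":
    old_name = "FC-6/8/yum"
    new_name = sister_page_name(old_name, "F-7")
    # Hypothetical page contents.
    print(sister_diff(old_name, "yum 3.0\nsame line\n", new_name, "yum 3.2\nsame line\n"))
```

In the wiki, the two texts would come from the stored page contents instead of literal strings.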