Weekly report: week 26, so many questions

I spent this week designing and implementing the CVS integration. Fedora’s CVS is set up so that you can’t download the upstream source packages directly from CVS. You first download a “sources” file from CVS, and from it you can generate the URL from which to download the .tar.{gz,bz2} files. I documented the whole process at http://fedoraproject.org/wiki/SummerOfCode/2007/VillePekkaVainio/SourceRepository
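
For illustration, here is a minimal sketch of how that URL generation might look in Python. It assumes each line of the “sources” file is “<md5sum>  <filename>” and that the lookaside cache uses a <base>/<package>/<filename>/<md5sum>/<filename> layout; the base URL below is an assumption, the wiki page above has the authoritative details.

    # Sketch: build lookaside cache download URLs from a Fedora "sources" file.
    # Assumes one "<md5sum>  <filename>" pair per line; the base URL is assumed.
    BASE_URL = "http://cvs.fedoraproject.org/repo/pkgs"

    def source_urls(package, sources_path):
        urls = []
        for line in open(sources_path):
            parts = line.split()
            if len(parts) != 2:
                continue
            md5sum, filename = parts
            urls.append("%s/%s/%s/%s/%s" %
                        (BASE_URL, package, filename, md5sum, filename))
        return urls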

At this point the manimport.py script takes a package name (like yum) and a branch name (like F-7) from the user. It downloads the “sources” file from CVS, parses it, downloads the source package, searches it for man files (files ending in .N, where N is 1-8), extracts those into a temporary directory, runs them through doclifter and puts the result into a wiki page under a “section”/”page name” namespace hierarchy.
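
The man page part of that pipeline could look roughly like the sketch below. The filename pattern and the way doclifter is invoked are assumptions for illustration, not a copy of what manimport.py actually does, and the wiki upload step is left out.

    # Rough sketch of the extraction step: find man files (*.1 ... *.8) in a
    # source tarball, unpack them into a temporary directory and run doclifter
    # on each one. Output naming is left to doclifter's own defaults.
    import os
    import re
    import subprocess
    import tarfile
    import tempfile

    MAN_FILE = re.compile(r'\.[1-8]$')

    def lift_man_pages(tarball_path):
        tmpdir = tempfile.mkdtemp()
        tar = tarfile.open(tarball_path)  # transparently handles .gz/.bz2
        for member in tar.getmembers():
            if member.isfile() and MAN_FILE.search(member.name):
                tar.extract(member, tmpdir)
                man_path = os.path.join(tmpdir, member.name)
                subprocess.call(['doclifter', man_path])
        return tmpdir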

I also made a request for resources to the Fedora Infrastructure team. Hopefully sometime soon I’ll have a public test wiki 🙂

According to my original schedule, I should have the publication phase completed next week. I do think it’s possible; there are basically three things that need to be done: improve the CVS handling even more, implement the sisterdiff functionality and import info pages too. There are some problems, though. I discussed some of these on #fedora-docs yesterday, but maybe I’ll post to fedora-devel-list too. You can also leave comments on this blog. Here are some of the points I’m currently thinking about:

CVS handling and updates

Now my code takes one package as a command line argument and processes all man pages from that package. But what about the bigger picture? How should I import all of the man pages in one distribution? Do a “cvs co -c”, download every “F-7” package it shows and look for man pages there? That’s a huge number of packages and it will take a lot of time. On the other hand, the import probably is not done that often.
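
Getting the package list for such a full import could start from something like this. The CVSROOT string and the way the “cvs co -c” output is parsed are assumptions that would need checking against the real server.

    # Sketch: list every module (package) known to the package CVS by parsing
    # "cvs co -c" output. The CVSROOT is assumed; continuation lines are
    # assumed to start with whitespace and are skipped.
    import subprocess

    CVSROOT = ':pserver:anonymous@cvs.fedoraproject.org:/cvs/pkgs'

    def list_packages():
        pipe = subprocess.Popen(['cvs', '-d', CVSROOT, 'co', '-c'],
                                stdout=subprocess.PIPE)
        output = pipe.communicate()[0]
        packages = []
        for line in output.splitlines():
            if line and not line[0].isspace():
                packages.append(line.split()[0])
        return packages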

How should I handle package updates and how can they be seen from CVS? Or should some kind of repository tracking be added too?

Functionally, maybe have a command line switch with which (so many words with w’s and i’s and h’s 😉 ) the user can decide whether to check for updates or do a full import.
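
As a sketch, that switch could be something along these lines with optparse; the option names are just a suggestion, not what manimport.py currently has.

    # One possible command line interface for choosing between a full import
    # and an update-only run; option names are illustrative.
    from optparse import OptionParser

    parser = OptionParser(usage='%prog [options] [package ...]')
    parser.add_option('-u', '--update', action='store_true', default=False,
                      help='only re-import packages that have changed in CVS')
    parser.add_option('-a', '--all', action='store_true', default=False,
                      help='do a full import of every package in the branch')
    options, args = parser.parse_args()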

Info pages

This is also an interesting problem. Texinfo sources can be converted to DocBook XML with the makeinfo command, but Texinfo is different from the Info format in which info pages are usually distributed in upstream source packages. To my knowledge there are no Info -> DocBook XML converters. So should I try to collect the Texinfo “masters” of the info pages, or how should this be handled? Writing an Info -> DocBook converter at this stage would probably take too much time away from other things.
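
If the Texinfo masters can be collected, the conversion step itself is simple; a sketch (the paths are illustrative):

    # Convert a Texinfo source file to DocBook XML with makeinfo's --docbook
    # switch. This only works on Texinfo sources, not on compiled Info files.
    import subprocess

    def texinfo_to_docbook(texi_path, xml_path):
        return subprocess.call(['makeinfo', '--docbook', '-o', xml_path,
                                texi_path])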

Sisterdiff

Sisterdiff means the ability to compare pages like FC-6/8/yum and F-7/8/yum; this is not possible with a regular Moin installation. It should be pretty straightforward, as someone has already made an Action like this for Moin. I’ll start with that and see what it can do.



Comments


  1. Alexander Boström

    Hi!

    I wonder, isn’t it better to just rip out the man pages from the already built binary RPMs?

    I’ve personally built RPMs where, in the .spec file, I have done things to the man pages that were in the tarball. In my case, it was converting them from ISO-8859-1 to UTF-8, but some packages might also carry patches to the man pages. Other packages might build the man pages from some SGML source, etc.

  2. Ville-Pekka Vainio

    That does sound like a reasonable way to do it. Patches aren’t really a problem; I can patch the sources before importing the man files. But stuff that gets done in .spec files is a problem when working with CVS. I’ll take a look at rpm-python etc. 🙂 Something like the sketch below might work.
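
For illustration, a rough rpm-python sketch of the approach Alexander suggests: open a built binary RPM and pick out the man pages it contains. The exact tag and flag names would need checking against the installed rpm-python version.

    # Rough sketch (rpm-python): list the man pages inside a built binary RPM
    # by reading its header.
    import os
    import re
    import rpm

    MAN_PATH = re.compile(r'^/usr/share/man/man[1-8]/')

    def man_files_in_rpm(rpm_path):
        ts = rpm.TransactionSet()
        ts.setVSFlags(rpm._RPMVSF_NOSIGNATURES)  # don't require imported GPG keys
        fd = os.open(rpm_path, os.O_RDONLY)
        try:
            hdr = ts.hdrFromFdno(fd)
        finally:
            os.close(fd)
        return [f for f in hdr[rpm.RPMTAG_FILENAMES] if MAN_PATH.match(f)]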
