Thu, 06 Nov 2008

REVU

Quality of packages not in Debian

While Ubuntu takes most of its packages from Debian, it does contain a few that are not in Debian, for one reason or another. One common reason is simply that someone wanted a package that wasn't in Debian, and packaged it and requested its inclusion.

The fact that these packages are not in Debian means that they are slightly different to the rest of the packages in Ubuntu, as they are completely Ubuntu's responsibility to maintain. By including them we are making a different sort of contract with our users and the authors of the software.

I was interested in how well we do at keeping this contract. A quick bit of scripting and I found some basic numbers on this. There are 886 packages in Ubuntu's universe component that are not in Debian. One possible measure of quality is the number of open bugs against these packages, which is shown in the following table as frequency counts.

Number of open bug reports    Number of packages
0                             595
1                             146
2                             49
3                             23
4                             17
5                             14
6                             8
7                             4
8                             3
9                             6
10                            5
11                            4
12                            1
13                            2
15                            2
17                            1
21                            2
22                            1
26                            1
30                            1
82                            1

So there are few open bugs on most of these packages, and around two-thirds have none. However, the table adds up to 986 open bugs in total, which we would want to do something about.
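As a check on the figures above, the frequency table can be summarised in a few lines of Python. This is only a sketch: the counts are copied from the table, and the variable names are mine rather than anything from the script used to gather the data.

```python
# Frequency table copied from the post: {open bug count: number of packages}.
bug_counts = {
    0: 595, 1: 146, 2: 49, 3: 23, 4: 17, 5: 14, 6: 8, 7: 4, 8: 3, 9: 6,
    10: 5, 11: 4, 12: 1, 13: 2, 15: 2, 17: 1, 21: 2, 22: 1, 26: 1, 30: 1,
    82: 1,
}

# Every package appears in exactly one row of the table.
total_packages = sum(bug_counts.values())
packages_without_bugs = bug_counts[0]
share_without_bugs = packages_without_bugs / total_packages

# Each row contributes (bugs per package) x (packages in that row).
total_open_bugs = sum(bugs * packages for bugs, packages in bug_counts.items())

print(f"packages: {total_packages}")
print(f"with no open bugs: {packages_without_bugs} ({share_without_bugs:.0%})")
print(f"open bugs in total: {total_open_bugs}")
```

Run as-is, this reports 886 packages, 595 of them (67%) with no open bugs, and 986 open bugs in total.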

To see how well these bugs are being triaged I looked more closely at the packages with open bugs, and at what percentage of those bugs were still in the "New" status with an "Undecided" priority, i.e. completely untriaged.

Percent of open bugs untriaged, at most    Number of packages
20                                         107
40                                         35
60                                         37
80                                         21
100                                        91

So again, this isn't too bad, with many packages having at most 20% of their bugs completely untriaged. Again though, the numbers are too high, and as this only counts bugs that are completely untriaged, I'm sure that most are not triaged to the level that we would want.
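The bucketing behind the table above could look something like the following. It is a sketch under assumptions: the field name `importance` (the "priority" mentioned above), the sample packages, and the choice to put 0% in the "at most 20" bucket are all mine, not taken from the original script.

```python
from collections import Counter


def is_untriaged(status: str, importance: str) -> bool:
    """A bug is completely untriaged if neither field has been touched."""
    return status == "New" and importance == "Undecided"


def untriaged_bucket(bugs: list[tuple[str, str]], width: int = 20) -> int:
    """Return the upper edge (20, 40, ..., 100) of a package's bucket.

    `bugs` holds one (status, importance) pair per open bug.
    """
    if not bugs:
        raise ValueError("only packages with open bugs are bucketed")
    untriaged = sum(is_untriaged(status, importance) for status, importance in bugs)
    # Integer ceiling of the percentage divided by the bucket width.
    edge = -(-100 * untriaged // (len(bugs) * width)) * width
    # Assumption: a package with 0% untriaged belongs to the first bucket.
    return max(width, edge)


# Hypothetical packages, purely for illustration.
packages = {
    "pkg-a": [("New", "Undecided"), ("Confirmed", "Low"), ("Triaged", "Medium")],
    "pkg-b": [("New", "Undecided")],
    "pkg-c": [("Confirmed", "High"), ("New", "Low")],
}

histogram = Counter(untriaged_bucket(bugs) for bugs in packages.values())
for edge in sorted(histogram):
    print(f"at most {edge}%: {histogram[edge]} package(s)")
```

Note that a bug with status "New" but an importance already set, as in `pkg-c`, does not count as completely untriaged here, which matches the definition used above.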

Lastly I wanted to see how much visibility we have of problems these packages may have. For this I looked at the number of bug subscribers for the source package. While this is not everything, I would certainly feel better if we had one or two people subscribed to the bugs for every one of these packages.

Subscribers    Number of packages
0              596
1              232
2              42
3              7
4              6
5              3

This shows that two-thirds of these packages don't have anyone subscribed to their bugs.

What about keeping up with upstream? Shipping old versions of packages means they are likely to be more buggy, and won't be popular with the authors of the software. The Ubuntu External Health Status site attempts to track this using the debian/watch files in the source packages. I attempted to pull the information about these packages from there.

A bit of explanation is in order for those not intimately familiar with Debian packaging. The debian/watch file specifies the location of the upstream source in such a way that new versions can be checked for, and the update of the package can be done semi-automatically. This can't be automated for a package that has no watch file, so I list the number of those. Packages that have a watch file that doesn't work for some reason are also listed, as the watch file is not usable for this, though sometimes that is a transient problem.

The figures for the number of packages in each of the three states are:

Out of date 46
No watch file 495
Broken watch file 32

The number of packages that are out of date is not that bad, especially considering we've been frozen for the last few months. However, there is a large number of packages without a watch file. This means that there's no automated way to find out about new versions of the package being released. It's possible to do without that, but I doubt that all 495 packages have someone watching over them.
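To make the three states concrete, here is a rough sketch of how a source tree might be classified. It is not how UEHS works internally (I haven't checked); `WatchError`, `watch_state` and `fake_checker` are hypothetical names, and the checker is a stand-in for whatever really consults upstream (a tool such as uscan would do that job in practice).

```python
import tempfile
from pathlib import Path
from typing import Callable


class WatchError(Exception):
    """Raised by the (hypothetical) checker when a watch file cannot be used."""


def watch_state(source_tree: Path, newer_upstream_exists: Callable[[Path], bool]) -> str:
    """Classify an unpacked source package by the state of its debian/watch file."""
    watch_file = source_tree / "debian" / "watch"
    if not watch_file.is_file():
        return "no watch file"
    try:
        outdated = newer_upstream_exists(watch_file)
    except WatchError:
        return "broken watch file"
    return "out of date" if outdated else "up to date"


def fake_checker(watch_file: Path) -> bool:
    """Stand-in for a real upstream check; treats an empty file as broken."""
    text = watch_file.read_text()
    if not text.strip():
        raise WatchError("nothing to scan")
    return "newer" in text


with tempfile.TemporaryDirectory() as tmp:
    tree = Path(tmp)
    states = [watch_state(tree, fake_checker)]      # no debian/watch yet
    (tree / "debian").mkdir()
    (tree / "debian" / "watch").write_text("")
    states.append(watch_state(tree, fake_checker))  # unusable watch file
    (tree / "debian" / "watch").write_text("newer release available")
    states.append(watch_state(tree, fake_checker))  # usable, newer upstream

print(states)
```

The point of separating "broken" from "no watch file" is the one made above: a broken file may be a transient problem, while a missing one means nothing automated will ever notice a new release.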

Conclusion

This is in no way scientific, and I'm very clearly adding some of my bias in to interpreting the results, but here are my conclusions from the investigation.

The packages that we have in Ubuntu universe but not Debian don't have too many bugs, but the ones that they do have are under-triaged, and we aren't that aware of what bugs we have.

Also, we could do more to help our automated tools for finding out about new upstream releases work better, again making sure we are aware when a package is out of date.

Actions

  1. Firstly, I am going to discuss with the QA team ways in which we can improve the QA on these packages. I will add an item to the QA team's agenda for the next meeting that I am able to attend to do this. A couple of ideas I have are:

    • a hug day to triage the bugs currently open on these packages.
    • a team that is subscribed to all packages not in Debian. I would like to discuss making this the MOTU team so that all reports end up on the mailing list, but that would require more discussion.
  2. Secondly, we should improve the situation with the watch files. I will add the UEHS page with the list of packages without a watch file to a wiki page of easy tasks for people to work on, or perhaps as a harvest data source. That UEHS page also lists packages maintained by the Debian QA team, and while it would be good to fix those in Debian, it may be a good idea to split the lists, not least because the Debian QA team may not appreciate an un-coordinated flood of watch files. I will talk to the maintainers of UEHS about the feasibility of doing this.

Why Ubuntu?

The reason we allow packages directly in to Ubuntu is that it brings benefit to our users. Most packages that enter Ubuntu will be of benefit to someone, and that's one of our aims, to give our users a good experience, isn't it? To give our users a good experience we also want high quality though. So we have a balancing act, splitting our workload between fixing what we have and spending time bringing in and maintaining new packages.

I would argue that we want to limit the flow of new packages quite severely, as we are not exactly short of work with our current set of packages. However, the argument isn't quite as simple as that.

A common way for people to get involved with development is to package something new and get it uploaded, and later branch out in to working on existing packages. This means that bringing in lots of new packages may lead to an influx of new contributors that more than makes up for the new workload. I'm not sure how strong this link will be, and whether the ratios will be such that there is a net gain in developer time.

Actions

  1. Discuss whether the rate of new developers coming from getting a new package in to Ubuntu is high enough to warrant the activity.
  2. I would also like to discuss ways to encourage people to start their involvement by working on existing packages instead.

Why not Debian?

So, some people will be wondering why all these packages aren't in Debian. The overarching reason for this is that contributors would rather get their packages directly in to Ubuntu.

There are several possible reasons for this, and I have heard most of these stated by someone trying to get their package in to Ubuntu.

One is a simple one, they don't run Debian, and so it's difficult to test in that environment. Yes, it's not impossible, but it is more work.

There is also a perception among many contributors that getting your package in to Debian is hard work, with long delays trying to find a sponsor. That's not something that most of us can really do much about. We can try and tackle the perception, but without upload rights we can't really fix the problem. There was the utnubu team that tried to streamline this process, but that is now defunct as far as I know.

Another reason they may not wish to do this is that they use Ubuntu and don't really care enough about Debian to do the extra work. This is something we can try and do something about, explaining the virtues of getting the package in to Debian, more than just it being the right thing to do in most cases.

As a counterpoint to this, if the package is going to be useful to a lot of people then even if the person proposing it for Ubuntu does not want to try and get it in to Debian then there is likely to be someone with the interest and skills to maintain it in Debian. For me this means that the packages that we want to be pulling in to Ubuntu should be easy to find Debian maintainers for anyway.

Actions

  1. To tackle the first issue we could have documentation of the best way to run Debian to test your packages, and links to the important places to keep an eye on the status of your package in Debian. While not solving the problem it may convince some to do the extra work as they don't have to learn a second way of doing things.
  2. For the second issue it would be great to get the utnubu team going again, but I can't start this, as I am not a DD.
  3. The third issue could again be tackled with documentation: we could have a wiki page explaining some of the virtues, and link this from places like REVU.
  4. Discuss requiring people proposing packages for Ubuntu to at least file a Request for Package (RFP) bug in Debian. This will give a much better chance that those interested in packaging for Debian are aware of the existence of the package. There are a lot of people interested in packaging for Debian that are just looking for something to work on.

REVU and queues

Now, I want to talk a little bit about REVU, and dealing with queues.

Again, a quick bit of background for those not involved in Ubuntu development. REVU is the tool we generally use in Ubuntu to review new packages. Anyone can propose a package for inclusion in to Ubuntu simply by uploading it there. Developers can then review it and provide feedback, asking for things to be changed where necessary. Once it is to a satisfactory standard and has the support of two people with upload privileges it is uploaded to the archive.

REVU is generally a nice platform for doing this work, and I'm not necessarily criticising its design here, I would just like to examine some of the effects of some design decisions.

REVU works with three queues. The first is the queue for packages that have one advocation from an Ubuntu developer. This queue is normally very short and fast moving, as the package is in good shape and just needs double-checking. The second queue is all of the packages waiting for a review. The third queue is for packages that need some work done on them before they can be uploaded. This means that for developers there is primarily one queue that contains things they can look at.

This is similar to the sponsorship queue, which is used in Ubuntu as a way for people without upload rights to make a change to a package in the archive. This can be seen as just one queue where the developer looks for things to review and upload.

The sponsorship queue is for changes to packages we already have in the archive, and we generally want to upload everything on the list, even if not at a certain point in the development cycle. The difference is that it's generally easier to give a reason for saying "no" for the rare times that it happens. This changes the complexion of what it means to keep the queue small.

If an item on the sponsorship queue is incomplete and the person who submitted it doesn't follow through then we should be picking up the item and ensuring the problem is fixed. If a package on REVU is incomplete and the person who submitted it doesn't follow through then there is no real problem. If the package is popular someone else will eventually pick it up and do the work.

We don't have (or at least haven't had) the time to review enough packages on REVU, so the queue is pretty long. Assuming that doesn't change, keeping the queue small will mean removing things because of lack of interest from the person submitting them. We can work to review more things, but unless we reach the point where we are accepting more packages than are being proposed that will always be the case.

There was recently a proposal to help clear the queue of things where the submitter has given up, so that effort can be focused on the packages where problems are likely to be fixed. While doing this makes the best use of developer time, I feel that if we need to do this then we have already lost.

If that is the case then there is a worrying aspect to it. We have had a bunch of people show an interest in Ubuntu development, and take the first steps towards becoming a developer, only to get discouraged and give up.

I have heard complaints about it being difficult to find a reviewer, and I'm sure the people that gave up would not speak fondly of the experience. However, this hasn't become widespread enough that it has stopped people giving it a go, and it would be terrible if it did, as having that reputation may cause potential contributors to look elsewhere.

I think this indicates that we should reconsider the way REVU is presented within our community. We are presenting a great service to people who are getting started with packaging and pointing them there saying, "put your package here, we will review it and you will get it in to Ubuntu." However, this unfortunately doesn't happen all that often. We are channeling potential new developers there, knowing that there is a high chance of them getting discouraged and giving up. On top of that we give the reason that many of them do give up as one reason that we're not better at reviewing the packages in the first place.

To me this almost feels like we are trying to put people off from getting started with Ubuntu development.

It looks as though the proposal to help clear the queue for the start of this cycle is going to go ahead. That's fine with me, we might as well make the most of a bad situation and help the reviewers this cycle. If we are in a position where we feel we need to do the same at the start of next cycle then I would say that it is clear we have not fixed the underlying issues, and we really need to stop and reconsider how we present REVU.

There is also a slim possibility that the clearing of the queue may have a different effect. If some contributors are put off by the number of packages waiting for review, then clearing that may cause them to put their own up there. While this could imply more potential developers, it would also mean more packages to review, making it harder to keep up.

If we keep up this cycle and have a good process going in six months' time then I will be happy, but it will make the rest of this post pretty much obsolete. I think now is a reasonable time to discuss the "what if?" though.

Changing things

If we are in the same situation in six months then I think it will demonstrate that even with a clean slate and a will to fix the problem we were not able to do enough reviews to keep up with the supply. In that case we will need to look for ways to turn down the taps, to reduce the supply.

In no way do I want to send the message that we shouldn't be welcoming to new contributors, I would just want to explore ways to get them started with working on our existing packages. It may involve making it harder to get a package in to Ubuntu though. We may just have to live with that.

In my opinion there would be two ways to target this, and the solution may come from some combination of the two. The first would be to make working on existing packages more attractive, easier to start with, and the default choice. The second would be to make proposing a new package more difficult, less attractive, or at least not demoralising if your package isn't reviewed.

If I had any bright ideas for the first then I would already be trying to implement them, so let's focus on ways to do the second. Keep thinking about the first though, and feel free to discuss any ideas with me.

I'll take three ideas from the recent mailing list thread and one I just had and look at what effects they may have.

Restricting access to REVU

Michael Casadevall proposed restricting those who could upload to REVU. In particular he proposed restricting it to the "Ubuntu Universe Contributors" team. These are people that have been given Ubuntu membership in recognition of their contribution to Ubuntu development. This includes all Ubuntu developers, but also others who haven't been given upload rights yet.

This would restrict REVU to those who have contributed to Ubuntu development in other ways, and so would make it clear that to get started you should work on existing packages.

In my opinion it also helps with the QA issue, as these people have shown sustained contribution, and are not going to disappear as soon as the package is uploaded. When looking for packages on REVU I first look for names that I know partly for this reason.

As a concession to those that haven't yet reached this stage but want to get a package in for some reason we could have a kind of sponsorship process where a member of the team can put a package there for someone who isn't if they think it is worthy. (While the normal sign and upload process would work I imagine we would only want to do this at most once, with the person that proposed the package being able to upload that one to make changes according to feedback.)

Asking for a rationale

REVU could grow a field for the person that proposed the package to give their reasons why they think the package should be included in Ubuntu.

While it is currently feasible to reject a package from REVU because it is not deemed worthwhile I'm not sure that it ever happens. In my opinion we could use this field as a basis for doing so, perhaps allowing the proposer to counter and give more reasons.

A more subtle effect I believe it would have would be to make the proposer think about these issues. There could even be several fields asking about various different things, such as responsiveness of upstream, potential number of users, etc. While it probably wouldn't stop anyone from uploading we could change the REVU help pages to explain that we upload packages based on their merit, and so if their package is not reviewed they may understand.

A related idea was to link the packages to brainstorm, so that we could gauge user interest.

Making REVU per-cycle

My second proposal was to make REVU per-cycle. This would mean that you don't propose a package for inclusion in to Ubuntu, but for inclusion in to the next release. REVU would open the day after a release with a clean slate, and then close sometime around Feature Freeze, with packages that were still on there being archived with a message that they were not successful this time.

I imagine that this would actually include the rationale field, asking why it should be included this cycle.

This goes further than the previous proposal though, in that it ensures a clean slate to work from.

More importantly, in my opinion, it focuses on the rationale even more. It asks the question, "For the next release, would you rather we spent time reviewing this package, or fixing those annoying bugs that you hit?"

However, I'm not sure that this proposal does enough to encourage people to work on existing packages to counteract its harshness.

Order packages by uploads

We could change the ordering of the packages from chronological order to ordering by the number of uploads to Ubuntu done by the submitter. We would then work top down on that list (obviously being able to pick packages from anywhere if we like).

This would draw an obvious link, upload a bug fix, get more chance of a review, but do so in a more subtle way, and not penalise brand-new contributors too much (there are plenty of bite-size tasks that are easy if you're able to create a package from scratch).

This wouldn't work that well if there was little distribution in the number of uploads, or the top of the list wasn't reviewed very often, but it could be a good way to work.

We could obviously substitute number of uploads for something else, for instance launchpad karma, or mix the numbers somehow. This would mean that the link was less direct, but it would reward the fantastic triagers, translators, etc. that want to get in to packaging.
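The ordering described in this section is easy to sketch. The code below is an illustration, not a patch for REVU: `Submission`, `review_order` and the sample data are all made up, and breaking ties by submission date is my assumption (it keeps the current chronological order among submitters with the same score).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Submission:
    package: str
    submitter: str
    submitted: date


def review_order(queue: list[Submission], score: dict[str, int]) -> list[Submission]:
    """Highest-scoring submitters first; ties fall back to oldest submission first.

    `score` maps a submitter to whatever we choose to reward: number of
    uploads, Launchpad karma, or some mix of the two.
    """
    return sorted(queue, key=lambda s: (-score.get(s.submitter, 0), s.submitted))


# Hypothetical queue and upload counts.
queue = [
    Submission("foo", "bob", date(2008, 9, 1)),
    Submission("bar", "alice", date(2008, 10, 20)),
    Submission("baz", "carol", date(2008, 10, 1)),
    Submission("qux", "dave", date(2008, 8, 15)),
]
uploads = {"alice": 12, "carol": 3}  # bob and dave have no uploads yet

ordered = [s.package for s in review_order(queue, uploads)]
print(ordered)
```

Passing the score in as a plain mapping means swapping uploads for karma, or for a weighted mix, changes only the dictionary and not the ordering code.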

Conclusion

I don't really have a conclusion, but I have plenty to think about, and I hope you have too. I'll be more than happy to discuss these issues at any time, so just give me a shout.

Posted at: 02:00 | category: /ubuntu | Comments (13)


Tue, 21 Oct 2008

Ups and Downs

I was thrilled this morning to finally come up with what I hope is a fix for a bug in ConsoleKit that has been plaguing a lot of users, judging by the number of subscribers and duplicates on the Ubuntu bug.

I was happy to upload fixes for various other bugs and sponsor some more from other members of the community.

I was pleased to see a group of people come together to prepare uploads for the 2.24.1 release of GNOME.

I enjoyed going to my favourite curry restaurant for lunch and listening to the stories from my friend's trip to Malawi.

I was disappointed to read an article on thedailywtf.com today after I saw a pointer to it.

I was saddened to read this post in response.

Carolyn, if on the off chance you read this post, I'm sorry that you feel that way. I can only hope it doesn't end your involvement with our community, though I would understand if it did, it can't be easy to be involved with a community which makes you feel like that, no matter how infrequently. I can only promise that I will try and discourage things I see which I feel are likely to provoke similar reactions, and do my best to build communities that are welcoming. I also apologise in advance for those times when I fall short.

-- This post belongs to Lionel Richie

Posted at: 01:57 | category: /tech | Comments (0)


Fri, 12 Sep 2008

Some advantages of having packages in Bazaar

After some discussions in the last week or so I decided to finally get around to something that I have been promising for ages: making a screencast about what is good about having packages in bzr. It's basically just me rambling for almost 20 minutes, so don't expect anything slick or compelling. Also, I have to apologise for speaking too quickly and for the skips in the audio. I'll improve this for next time.

If there are areas that you would like me to go in to in more detail in this format then leave a comment and I'll see what I can do. As usual feel free to grab me if you would like to discuss anything, and there are lots of interesting things that could be prototyped if you are in to that.

Download the video

I'm not sure where else to put this so that those who don't read planet could find out about it. It doesn't seem like something that should be sent to a mailing list, and just dropping a note on IRC isn't that reliable. Any suggestions would be welcome.

Posted at: 02:41 | category: /ubuntu | Comments (3)


Thu, 28 Aug 2008

Making Intrepid Solid

With feature freeze now in effect the bulk of the big changes in Intrepid should now be done. There will still be new features entering the archive with the appropriate exceptions, but the rate will slow as we move forward.

Now we really need to focus on making Intrepid solid. We want to squash as many bugs as possible, so that when we deliver the final release it is something we can be proud of.

This is something that everyone can help with. There are plenty of ways to help out, so there should be something for everyone.

Testing

Simply running Intrepid and reporting bugs is a great start. It's still not recommended to run it if you aren't able to fix a system that doesn't boot, or where X doesn't start, but if you are then upgrading now will be a great help.

You can do more than just use the system though: pick an application, start testing all of its functionality, and report the bugs that you find. Some bugs only show up in certain locales, with certain hardware, or with certain combinations of packages, so try different things and look for serious problems.

Upgrade testing is an area that is under-tested until the last minute, when floods of users upgrade. Also, the easiest upgrade testing to do in bulk is of pretty standard installations, but this doesn't catch a lot of problems. So, when you are comfortable with running Intrepid, upgrade and let apport file any upgrade problems that you find. You should also be on the lookout for unnecessary prompts that happen while upgrading, or packages that are left broken by the upgrade.

Even if you are not happy running Intrepid yet you can still potentially help upgrade testing, thanks to the unstoppable Michael Vogt. He has written a tool that will clone your system in to a kvm virtual machine, and then upgrade that. This means you can test a real world upgrade without risk to your system. If you do this a few times during the remaining time for Intrepid and file bugs, then you will have a much better chance of a hassle free upgrade to the final release. You can find more details on Michael's work here. (Not everyone has kvm capable hardware though unfortunately.)

Looking at bugs

As well as trying to find your own bugs you can look at the ones that other people have already found. There are several important things here.

The first is bug triage, trying to make sure that a bug report has all the information that it needs, and trying to set an appropriate priority. This is really important work, and we always need more help doing it, so consider joining the bugsquad and helping out.

At this time important bugs should also be milestoned so that they can be concentrated on for the release if possible. Deciding the different classes of bugs here is really tricky, and there can often be disagreements. It is important work though. If you see a bug that should probably become release critical then work with the bugsquad to triage it, and make sure to suggest that it is considered for release-critical status.

Developers can help by actually trying to fix these bugs. Some can be easy, for instance if they are known to be fixed elsewhere. Some can be really complex, and take a lot of effort. Fixing things from the release-critical bug list, and lists of other important bugs is always valuable.

Doing the easy things

There are some ways to improve the quality that are actually fairly easy. Though there is a freeze in effect in Ubuntu there is still loads of work going on elsewhere, and many, many bugs being fixed elsewhere. Pulling these fixes in to Ubuntu will improve the quality, while in theory being easier than coming up with a fix for a bug.

We should keep an eye on upstream projects and pick up bug fix point releases to the versions that we have. If the project doesn't do this then look out for important fixes going in to trunk and back-port them. If you are doing that then it can be worthwhile looking at the versions in other distributions that plan to release soon, and if they carry the same version suggesting that you share the workload of creating these point releases, or at least collaborate on fixes and share them.

Adding external bug watches in Launchpad is also a great way to help. Jorge explained this recently. This makes it easy to spot when there is a bug fix that we could pull in. When there is, it will appear on Harvest.

Harvest (now with a new look) is another easy way to do things. It lists opportunities to fix things that should be fairly easy, such as bugs fixed elsewhere, or bugs with patches attached.

Making Intrepid+1 rock

In parallel with all the above now is the time to start thinking about what you want to achieve in the next cycle. If any of that requires changes to an upstream then speaking to them early can be a good idea, as you can get their feedback and see how it fits in to their plan. I'm sure everyone has loads of ideas, and a bit of preparation now can help you hit the ground running in the next cycle.

Posted at: 12:51 | category: /ubuntu | Comments (0)


Tue, 26 Aug 2008

I love a bad book

Last night I finished reading "Exit A" by Anthony Swofford. I had decided a while ago that I didn't like it, but it wasn't so bad that I had to put it down, so I stuck with it to the end. It wasn't a bad book, it was just poor in places, and disappointing overall.

I did prefer the act of reading the book to the book itself though. I have just read 10 or 20 jaw-droppingly good books in a row. The previous book was "Disgrace" by J. M. Coetzee, which is stunning. Read it. I was beginning to think that I just possibly enjoyed most books a lot. Reading a book I didn't enjoy showed me that I just read a lot of good books.

I saw this one as a new release in a bookshop, and it had a piece of card with it, written by one of the members of staff in the shop. The card said something like "Swofford could have been forgiven for writing a poor second book, but he doesn't need to be, he can really write." I agree for the most part, he certainly could have been forgiven, and this book isn't bad enough that he really needs to be. It is, however, not a great book, unlike "Jarhead", which I haven't changed my opinion on.

It wasn't just the act of reading a mediocre book that buoyed my spirits though. I wanted to like the book, so it wasn't just that it's not a famous classic like "Disgrace", so I'm not that much of a book snob. Also, it was really the card that caused me to buy it, even though I was drawn by the author's name, so it shows that my instincts are good, which gives me confidence when choosing books in the future.

The final aspect is the one that makes me happiest though. I know why it's not a great book. I can point to places in the book and tell you why they are bad. At school I was terrible in English classes, I didn't understand the first thing. Reading this book gave me confidence that I am learning while reading. Not only learning about life and the world, which I was already conscious of, but also learning about language and writing. Even though I'd never be able to write like I would like to, I can at least comfort myself with the knowledge that I am able to partly understand the mechanics of good writing.

Posted at: 01:46 | category: /life | Comments (0)


Sun, 17 Aug 2008

Help needed

The server team asked me to write a blog post to ask for help removing the use of "multiuser" as an argument to update-rc.d. This used to be the way that we sped up the shutdown time a little, but we've changed the approach now.

This is great, as it means that we can get rid of some of our diff to Debian, as well as helping Debian to get the improvements (the original approach was never accepted by Debian, the new one has been).

However, I no longer need to write the post. Thanks to a few people most of the work has now been done. In particular I would like to thank Didier Roche, Nicolas Valcarcel, and Cesare Tirabassi, as well as the sponsors that uploaded their work when needed.

However, I thought I'd write the post anyway, as there is still loads of work to do, so if you are interested in helping out with development then come and get stuck in. I'll try and post about some specific tasks in the future, hopefully before somebody does all of the work next time.

-- This belongs to Lionel Richie

Posted at: 00:47 | category: /ubuntu | Comments (0)


Sat, 16 Aug 2008

Does it just work?

I recently attended LugRadio Live in Wolverhampton. During the live recording of the show they were discussing how things have changed since they started the show. Aq was saying that things are less interesting nowadays, as everything just works. I disagree that it's less interesting: not spending time compiling drivers for my video card means I can spend time on other things.

I bought another laptop this week, and so I can personally attest that the developers involved should be proud, all of the hardware was supported out of the box, with just a few minutes following instructions on a wiki page to get everything working properly. Also, all of those little tweaks won't be needed in a few months time, as the developers have fixed the problems.

However, I also disagree that everything just works. Even though it took me only a couple of hours to install and get all of the hardware working, I'm still setting up the laptop. Why do I have to spend ages configuring all of the applications, even though 90% of the settings will be the same as on my other laptop? I could copy across my dotfiles, but there could be more done to help me move just those that make sense to be on another machine. (I don't think I would want to copy the whole of .mozilla/)

There's more though. Though I have two laptops running linux next to each other, it's not that easy to move a file between them. Why do I have to enter details of my contacts more than once? Why isn't it trivial for me to send off an email to someone I am chatting to on IRC? I could go on.

The work on the kernel, drivers and installers that meant that it only took me a couple of hours to get up and running is a fantastic achievement; it's what allows us to ask questions like these. There is more that needs to be done in these areas, but we need to expand our ideas of what should just work.

I don't wish to discredit those that are working on this sort of problem, there should be more people helping them. We need other developers to appreciate the issues, and support those trying to tackle them.

If you agree with me then don't complain about it, fix it. Find a project working on a problem that you care about and support them however you can. I realise the irony inherent in telling everyone this at the end of a post like this, I hope you will forgive me.

-- This belongs to Lionel Richie

Posted at: 23:37 | category: /tech | Comments (4)


Thu, 14 Aug 2008

MOTU School sessions for Ubuntu Developer Week wanted

Next month we have another Ubuntu Developer Week. It's still in the planning stage, and there will be a proper announcement later, so if you are interested in attending wait for that.

This post is for those who are in a position to give sessions. I want to get several MOTU School sessions included in the schedule, but for that I need presenters willing to give them.

There's a list of some ideas for sessions at

https://wiki.ubuntu.com/MOTU/School/Requests

as always. If there is a session there that you would like to give then get in touch with me. It doesn't have to be one from that list, I'm interested in any session that you are willing and able to present.

In particular I'm really keen to see the sessions on Java this time. I'll speak to the Soyuz team to see if they are willing to present a session, as that one has several votes. I'll also probably present a bzr session with David, and a packaging with bzr session.

If you want to have a session during the week on something then stick your name and the title on

https://wiki.ubuntu.com/UbuntuDeveloperWeek/Prep

or grab dholbach or me to discuss it.

Speaking of dholbach, let's make this developer week ROCK!

I'll leave you with Eugene O'Neill:

There is no present or future,
only the past happening over and over again,
now.

Posted at: 01:37 | category: /ubuntu | Comments (0)


Thu, 07 Aug 2008

Tempted by a stage dive

Hello Planet Ubuntu.

This feels like I've just made it on stage with my favourite band.

Today I was accepted as an Ubuntu member through the Universe Contributors group. Thanks to all those that helped me achieve this. There's been a load of other people join this group recently. Hopefully this is a sign of things to come, and we're going to have some great releases coming up.

It's an interesting time to be joining planet, with yesterday's CC meeting discussing what role the planet should play in our community. I think that Emma had a good post touching on how we should conduct ourselves. (Hey Emma, when are you going to become a member so that more people see your insightful posts? And the animal ones too, I like those)

Emma refers to a poem by Robert Frost in her post, I had heard of it before, but I had never read it, so I hunted it out. I recommend reading it.

Mending Wall
Robert Frost (1874-1963)

SOMETHING there is that doesn't love a wall,
That sends the frozen-ground-swell under it,
And spills the upper boulders in the sun;
And makes gaps even two can pass abreast.
The work of hunters is another thing:
I have come after them and made repair
Where they have left not one stone on a stone,
But they would have the rabbit out of hiding,
To please the yelping dogs. The gaps I mean,
No one has seen them made or heard them made,
But at spring mending-time we find them there.
I let my neighbour know beyond the hill;
And on a day we meet to walk the line
And set the wall between us once again.
We keep the wall between us as we go.
To each the boulders that have fallen to each.
And some are loaves and some so nearly balls
We have to use a spell to make them balance:
"Stay where you are until our backs are turned!"
We wear our fingers rough with handling them.
Oh, just another kind of out-door game,
One on a side. It comes to little more:
There where it is we do not need the wall:
He is all pine and I am apple orchard.
My apple trees will never get across
And eat the cones under his pines, I tell him.
He only says, "Good fences make good neighbours."
Spring is the mischief in me, and I wonder
If I could put a notion in his head:
"Why do they make good neighbours? Isn't it
Where there are cows? But here there are no cows.
Before I built a wall I'd ask to know
What I was walling in or walling out,
And to whom I was like to give offence.
Something there is that doesn't love a wall,
That wants it down." I could say "Elves" to him,
But it's not elves exactly, and I'd rather
He said it for himself. I see him there
Bringing a stone grasped firmly by the top
In each hand, like an old-stone savage armed.
He moves in darkness as it seems to me,
Not of woods only and the shade of trees.
He will not go behind his father's saying
And he likes having thought of it so well
He says again, "Good fences make good neighbours."

From http://www.bartleby.com/118/2.html

Posted at: 02:28 | category: /tech | Comments (0)


Tue, 24 Jun 2008

bzr-upload

On Friday bzr-upload was officially announced. In the blog announcement Martin tells the story of how this plugin was born. I'm proud to say "I was there!". I was sat at the table with them that evening as they discussed what was wanted, and what was possible. This was just one of the great things about the last sprint for me, and I was only there for two days.

Very often on the #bzr IRC channel we have users asking why bzr doesn't update the working tree on a remote machine, and what they can do about this. Very often it turns out that they are web developers who are looking to deploy a website, rather than just host the branch, and so the normal behaviour is kind of the opposite of what they want.

John wrote the push-and-update plugin to help with this, but it didn't fulfil all the needs, and requires ssh access, where web developers sometimes only have ftp access.

While bzr-upload does have some corner cases to be wary of, it's a great thing to have available. If you are a web developer who is looking for a version control system for your code then consider bzr; it will hopefully suit your workflow very well.

Now, watch out for Martin's improvements to loggerhead, and Vincent's improvements to his kitchen.

Posted at: 20:38 | category: /bzr | Comments (0)


Tue, 13 May 2008

Gutenberg's Revolution

I love Stephen Fry, everything he does is great, but also seems to come with a touch of quality as well. His documentaries are probably a lesser known part of his work, but they are equally fantastic; the two-part documentary on manic depression was particularly notable.

Last night I watched a new documentary presented by him, and while it was neither as moving nor as personal as the others I have seen, it was still interesting and enlightening. This particular documentary was about Johannes Gutenberg, the printing press that he invented, and the impact which this had upon the world.

He explained that the printing press, and the increased access to knowledge that it allowed, was a major factor in the Renaissance, which radically changed the world, and can be seen in the world in which we live today.

In Gutenberg's time the Church was the most powerful organisation. He worked with the Church, and tried to show them the benefits of his idea. It was suggested that he would never have succeeded if he had not courted the Church. If the printing press was indeed the catalyst for the Renaissance, and the Renaissance was the start of the decline in the power of the Church that we see today, then the Church's embrace of the printing press could be said to have precipitated its loss of power and influence.

Are there any ideas at the current time that are as powerful as the idea of the printed word? During the programme there were a couple of references to the growth of the printing press being similar to the growth of the Internet in recent years.

The Internet, like the printing press before it, allows a new group of people to have direct access to information. Will that access cause a fundamental shift in our world and our lives?

Is there a dominant force in the world which will be diminished by the Internet? Are they currently embracing it as a tool which they can use to entrench their position? Are we thinking too small; is there something we haven't thought of yet that is going to have an even more radical impact?

Posted at: 11:54 | category: /tech | Comments (0)


Fri, 25 Apr 2008

A long anticipated release

Now that one important release is out the way it's time to look forward to another one. This has been long anticipated, and will probably decide whether they go on to great new heights, or are just remembered for their previous contributions, but aren't considered relevant any more.

No, I'm not talking about the Fedora release, that is definitely relevant, and is sure to be great. I'm talking about the release of "Third", the new album from Portishead. Fingers crossed it's going to live up to the legacy.

Oh, and if you don't know, get to know.

Posted at: 15:17 | category: /life | Comments (0)


Jerk Pork Burger with Green Apple Slaw

Thanks to richb I found this recipe for "Jerk Pork Burgers with Green Apple Slaw", and they are fantastic. The Slaw is particularly good, I'd really recommend you try it, even if it does sound a little weird.

Unfortunately, it appears as though "The Bell", a pub just round the corner, no longer serves food. No matter how good a burger is, I'm sure there's no way it could be as good as a Bell Burger. It seems I'm stuck spending the rest of my days trying in vain to recreate that marvellous piece of meat technology.

Posted at: 14:59 | category: /food | Comments (1)


Thu, 13 Mar 2008

Revision numbers

Revision numbers vs. revision ids

One thing that Bazaar does a little differently from the other distributed systems is to give every revision a revision number. Some people don't like this as the revision numbers are local to a branch rather than global; that means that the revision number of a revision in my branch does not necessarily match the number that it was given in your branch. Some people say that this makes bzr somehow "less distributed." This is not the case at all; you just need to be clear about which branch you are referring to, i.e. say "revision 315 on branch http://...", rather than just "revision 315".

This is a little dangerous in that the branch may have its revision history changed, for instance by uncommitting and then committing again. If that may happen then you should use revision ids, which you can find from "bzr log --show-ids".
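To make the "which branch?" point concrete, here is a minimal sketch. It is not bzr's code: the revision ids, the parent map and the mainline() helper are all made up for illustration. It models a mainline as the chain of left-hand parents, and shows that "revision 3" names different revisions in two branches that share history.

```python
def mainline(tip, parents):
    """Follow left-hand (first) parents from tip; return ids oldest first."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev)
        rev_parents = parents.get(rev, [])
        rev = rev_parents[0] if rev_parents else None
    line.reverse()
    return line


# Hypothetical history: my branch is a-b-c; your branch is a-x-m,
# where m merges my tip "c" into your work.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],
    "m": ["x", "c"],
}

mine = mainline("c", parents)
yours = mainline("m", parents)

# Revision number 3 is a different revision in each branch.
print(mine[3 - 1], yours[3 - 1])  # c m
```

The revision ids ("c", "m") mean the same thing everywhere, which is why they are the thing to fall back on when the branch is not obvious from context.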

Why number the revisions at all then? One reason is simply that some people find revision ids ugly, and may be scared off by them.

Another reason is that they are shorter to type. git folks will tell you that you can use the first few characters of the revision id, and git will work out what revision you mean. However these shortened ids are not necessarily stable over a long period of time, and so again, if you are worried you should use the whole thing.

The third reason is that the numbers can give some sense of the order of the revisions in your branch. If I talk about revisions "3445abe" and "b27ac9" then you don't know which is earlier in history. If I refer to revisions 345 and 532 then it is immediately obvious (providing that I am only referring to a single branch).

The advantages I have outlined are small, but they could be valuable at times, and you always have the revision ids to fall back on if you require them.

Numbering merged revisions

Along with numbering the mainline bzr also numbers merged revisions using a dotted numbering scheme. This means that your mainline revisions are "1", "2", "3", as you would expect, but any merged revisions are given three digit numbers, e.g. "2.1.3".

The numbering scheme has a couple of nice properties, the most notable of which is that it is "stable": once I have numbered a merged revision with respect to a certain mainline, its number cannot be affected by any addition I make to the revision history. This means that any commits, pulls or merges that I do will not change any of the existing revision numbers, but they will add numbers to any new merged revisions such that they will not be the same as any number already used.

The current algorithm used for this involves looking at the whole history of the branch to number the revisions, which is obviously undesirable.

On the last evening of the sprint last week John and I were discussing the numbering scheme, and thinking about possible algorithms to do the numbering that would be more efficient.

We had the following inputs:

  1. The revision id you are trying to give a dotted number for.
  2. The tip of the mainline that you are numbering against.
  3. The revision number of that mainline revision.
  4. A map of revision_id -> parents.

And you are asked to provide the revision number for the specified revision. Any other numbering that you may be able to do along the way would be a bonus.

The fact that all we are given to get the information we need is a map telling us the parents of a revision id means that we cannot ask the question "what are all the children of this revision?"

I don't really want to explain the numbering scheme here, as it is a little long-winded to do so. The outline is that for the first digit you find the intersection of the target revision's left hand ancestry with the mainline, and use its revision number. For the second digit you find all of the branches that originated at the revision found in the first part, and then number them by the order that they merged back in to mainline. The third digit is then just the place of the revision in its own part of one of these branches.
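The first digit is the easy part, and it can be sketched from the four inputs listed above. This is only an illustration of the outline, not the algorithm we discussed or anything from bzr itself; the function names and the example graph are invented. It numbers the mainline by walking left-hand parents down from the tip, then walks the target's own left-hand ancestry until it lands on a mainline revision.

```python
def mainline_revnos(tip, tip_revno, parents):
    """Map each mainline revision id to its revno by walking left-hand parents."""
    revnos = {}
    rev, revno = tip, tip_revno
    while rev is not None:
        revnos[rev] = revno
        rev_parents = parents.get(rev, [])
        rev = rev_parents[0] if rev_parents else None
        revno -= 1
    return revnos


def first_digit(revision_id, tip, tip_revno, parents):
    """Revno of the mainline revision that revision_id's left-hand ancestry hits.

    Returns None if the left-hand ancestry never reaches the mainline
    (how that case should be numbered is left out of this sketch).
    """
    revnos = mainline_revnos(tip, tip_revno, parents)
    rev = revision_id
    while rev is not None and rev not in revnos:
        rev_parents = parents.get(rev, [])
        rev = rev_parents[0] if rev_parents else None
    return revnos.get(rev)


# Hypothetical history: mainline a-b-c-m, with f1-f2 branched from b
# and merged back in at m.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "f1": ["b"],
    "f2": ["f1"],
    "m": ["c", "f2"],
}

print(first_digit("f2", "m", 4, parents))  # 2
```

The second and third digits are deliberately not attempted here, since the second is where the ordering of merges comes in.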

Notice the second step there. Remember that we are not able to retrieve the children of any revision? That means that we must work backwards from our mainline to do this. This is where the real complexity comes in, and it appears as though it is necessary to search a reasonable amount of history to calculate this part.
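A rough sketch of why this is costly, under the same assumption that all we have is a map of revision id to parents (the names and graph here are hypothetical, not bzr's): to learn the children of even one revision we have to walk backwards from the tip and invert the edges as we go, and we cannot be sure we have seen every child until the walk has covered everything that might descend from it.

```python
from collections import defaultdict


def children_within(tip, parents):
    """Invert the parent map for every revision reachable from tip."""
    children = defaultdict(list)
    seen = set()
    pending = [tip]
    while pending:
        rev = pending.pop()
        if rev in seen:
            continue
        seen.add(rev)
        for parent in parents.get(rev, []):
            # Each revision is processed once, so each edge is recorded once.
            children[parent].append(rev)
            pending.append(parent)
    return children


# Hypothetical history: mainline a-b-c-m, with f1-f2 branched from b
# and merged back in at m.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "f1": ["b"],
    "f2": ["f1"],
    "m": ["c", "f2"],
}

children = children_within("m", parents)
print(sorted(children["b"]))  # ['c', 'f1']
```

This naive version visits the whole ancestry of the tip, which is exactly the behaviour a better algorithm would want to avoid.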

After the discussion with John I had a reasonable idea of how such an algorithm would work, and yesterday I posted a first draft of that to the mailing list. We have found some problems with it, and haven't benchmarked it yet to see if it is actually an improvement, but hopefully it will evolve and prove to be faster.

Displaying logs, and history emphasis

The revision numbering code has a very close relationship with the log code, and also interacts with it in an awkward way from a performance standpoint. This led to John explaining to me in more depth how the logs are generated.

When bzr produces logs by default it emphasises the left hand parent to produce your mainline. It then indents any revisions that you merged:

>       -----------------------------------------------------------
>       revno: 3270
>       committer: Canonical.com Patch Queue Manager <pqm@pqm.ubuntu.com>
>       branch nick: +trunk
>       timestamp: Thu 2008-03-13 00:40:30 +0000
>       message:
>         (Adeodato Simo) Add a space after "revision-id:" in log output.
>          -----------------------------------------------------------
>          revno: 3257.2.1
>          committer: Adeodato Simó <dato@net.com.org.es>
>          branch nick: foo
>          timestamp: Sun 2008-03-09 23:06:47 +0100
>          message:
>            Add a space after "revision-id:" in log output.
>       -----------------------------------------------------------
>       revno: 3269
>       committer: Canonical.com Patch Queue Manager <pqm@pqm.ubuntu.com>
>       branch nick: +trunk
>       timestamp: Wed 2008-03-12 23:08:34 +0000
>       message:
>         (Daniel Watkins) Add a --revision option to 'bzr push'
>           -----------------------------------------------------------
>           revno: 3256.1.5
>           committer: Daniel Watkins <D.M.Watkins@warwick.ac.uk>
>           branch nick: push-r
>           timestamp: Sun 2008-03-09 18:41:31 +0000
>           message:
>             Added NEWS entry.

To do this it must decide which revisions are present in the history of one revision, but not in the history of its left hand parent. To do this it starts off two history walkers in parallel, one walking the history of the first revision, the second walking the history of the parent. The first walker then stops walking down a particular line of history when the second "claims" it, once the first walker has no more lines of history to walk it returns its group of revisions, and the log formatter code then displays them indented as necessary to match the history.
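The set being computed can be written down quite simply if we ignore performance. The sketch below is not how bzr does it: instead of two walkers running in parallel and stopping early, it computes both full ancestries and subtracts them, which gives the same answer on a small graph. The function names and the example history are assumptions for illustration.

```python
def ancestry(rev, parents):
    """Return rev plus every revision reachable through any parent."""
    seen = set()
    pending = [rev]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        pending.extend(parents.get(current, []))
    return seen


def merged_by(rev, parents):
    """Revisions in rev's history that are not in its left-hand parent's history."""
    rev_parents = parents.get(rev, [])
    if not rev_parents:
        return set()
    left_hand_parent = rev_parents[0]
    return ancestry(rev, parents) - ancestry(left_hand_parent, parents) - {rev}


# Hypothetical history: mainline a-b-c-m, with f1-f2 branched from b
# and merged back in at m.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "f1": ["b"],
    "f2": ["f1"],
    "m": ["c", "f2"],
}

print(sorted(merged_by("m", parents)))  # ['f1', 'f2']
print(sorted(merged_by("c", parents)))  # []
```

Here f1 and f2 are the revisions that would be shown indented under m. The point of the parallel walkers is to get this result without walking all the way back to the start of history for every mainline revision.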

This is a much more complex process than what you get with "git log", in which the revisions are produced in just date order. There is a "--topo-order" option to git log, but that just ensures that no parent is output before all of its children have been. It doesn't ensure that all ancestors not in the ancestry of the left-hand parent are shown before the left-hand parent. The work to ensure that is significantly more than that done to provide "--topo-order".

This display makes it easy to see what work was done on a branch, and when those changes entered your branch. This is one reason why bzr's merge doesn't fast-forward by default ("bzr merge --pull" will do this for you if you like). This means that you can always instantly identify which work came from another branch and have them tied together.

Always having merge commits means that "bzr log --short" and "bzr log --line" can give you a good summary of what happened on your branch, the commits you did, and the things that you merged. It preserves a mainline for you in the left hand ancestry, which means that you can always see what happened in that particular branch. "bzr pull" then gives you a mirror of another branch, and the left hand ancestry tells you what happened in that branch.

The indentation of the merged commits (and the fact they disappear with "--short" and "--line") means that mentally they become of lesser importance. You see "merged performance work from Emma's branch", rather than all of the commits that you got from her. They are still there to look at if you want, but they can be ignored at most times.

This means that you don't have to spend time rewriting history to be clean if you don't want to. You don't have the history right in your face either way, though there can still be value in having a clean history. However rewriting history is not what some people want to do, and causes problems for those who base their work on yours.

Posted at: 16:49 | category: /bzr | Comments (0)


Wed, 12 Mar 2008

Version control systems and text editors

So apparently "learning git is like learning vim". Putting aside the incremental learning aspects of this, and stretching the point a little, will you allow me to say "git is like vim"?

We all understand there is no way in which you would mandate that all contributors use vim. You wouldn't want to lose all of those valuable contributions from emacs users of course. However, you still wouldn't dream of mandating the use of one of these two editors. Why should your choice as project maintainer constrain the way in which others want to work?

Obviously it is quite difficult to enforce this editor rule. For a start there is nothing in a plain text file that tells you what editor was used to create it. More importantly though, the contributor's choice of editor doesn't matter to you. If they send you a plain text file then your editor will handle it just as well as theirs.

This is where version control differs from editors. When using the version control system to move code around it tends to dictate the client you use to access it, so one person's decision tends to impact on others.

Is the solution therefore to work towards a situation similar to the one we have with text editors, where the interchange format is understood equally well by all of the tools? Or do we spend time developing wrappers for each use that allow us to ignore the fact that we are using different systems?

Recently there has been work done to make bzr support the git-fast-import format. This would then be the start of an interchange format that all tools could use to communicate. However, the problem is that the representations used in one system start to bleed through. For instance, bzr supports ghosts, and we are currently discussing adding support to the format to represent them. However git doesn't support them, and as such there will be no way to complete a round trip of bzr->git->bzr when there are ghosts involved.

So, what about the other solution? Creating wrappers that allow the user to not care what VCS they are using and just get the job done? I think this is useful to a point. It will be great for some people who just want to do really simple things on lots of projects (for instance in Debian). However the tools are necessarily catering to the lowest common denominator, they won't support any of the unique things that make each system great.

Bazaar has foreign branch support (most notably bzr-svn) which allows you to access another system as if it were bzr. This is almost completely transparent ("bzr branch svn://" makes it clear what the project is hosted in), in contrast to git-svn. The latter adds a new command that allows you to do the svn specific parts (setting up the repository, committing back to svn). In contrast bzr-svn uses the normal bzr commands for (almost[1]) everything, meaning you only need to learn the one tool. git-svn is still a great tool, but it certainly makes you realise that you are not dealing with pure git.

The competition between the systems has been great for every one of them. However, it seems like we will be stuck with different systems for the foreseeable future, so we should work hard on making them work well together to ease the pain on the users. I think that many of the supporters of distributed version control would say that it is better for you to be using any of them than none of them, but the fractured and unstable landscape we have now is causing a resistance in people to make the switch.

[1]It currently adds svn-push for doing a push that creates a new branch in svn, but this is only a temporary thing, "bzr push" will be able to do this at some point. The other commands that are added are for extra things that the core bzr is not meant to deal with.

Posted at: 01:25 | category: /bzr | Comments (0)