<?xml version="1.0" encoding="utf-8"?>
<!-- name="generator" content="pyblosxom/1.3.2 2/13/2006" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
<channel>
<title>James Westby</title>
<link>http://jameswestby.net/weblog</link>
<description></description>
<language>en</language>
<item>
  <title>Improving the usability of launchpadlib-using code</title>
  <link>http://jameswestby.net/weblog/tech/16-Improving-the-usability-of-launchpadlib-using-code.html</link>
  <description><![CDATA[
<div class="document">
<p>Normally when you write some code using launchpadlib you end up with Launchpad
showing your users something like this:</p>
<img alt="/images/lplib-before.png" src="/images/lplib-before.png" />
<p>This isn't great: how is the user supposed to know which option to click? What
do you do if they don't choose the option you want?</p>
<p>Instead it's possible to limit the choices that the user has to make to only
those that your application can use, plus the option to deny all access, by
changing the way you create your Launchpad object.</p>
<pre class="literal-block">
from launchpadlib.launchpad import Launchpad

lp = Launchpad.get_token_and_login(&quot;testing&quot;, allow_access_levels=[&quot;WRITE_PUBLIC&quot;])
</pre>
<p>This will present your users with something like this:</p>
<img alt="/images/lplib-after.png" src="/images/lplib-after.png" />
<p>which is easier to understand. There could be further improvements, but they would
happen on the Launchpad side.</p>
<p>This approach works for both Launchpad.get_token_and_login and Launchpad.login_with.</p>
<p>The values that you can pass here aren't documented, and should probably be constants
in launchpadlib, rather than hardcoded in every application, but for now you can
use:</p>
<ul class="simple">
<li>READ_PUBLIC</li>
<li>READ_PRIVATE</li>
<li>WRITE_PUBLIC</li>
<li>WRITE_PRIVATE</li>
</ul>
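<p>Until launchpadlib provides such constants, a small module of your own can stand in
for them. This is only a sketch: the constants simply mirror the strings listed above,
and check_access_levels and login are hypothetical helper names, not part of launchpadlib.</p>
<pre class="literal-block">
# Hypothetical constants mirroring the access level strings listed above.
READ_PUBLIC = 'READ_PUBLIC'
READ_PRIVATE = 'READ_PRIVATE'
WRITE_PUBLIC = 'WRITE_PUBLIC'
WRITE_PRIVATE = 'WRITE_PRIVATE'
ACCESS_LEVELS = (READ_PUBLIC, READ_PRIVATE, WRITE_PUBLIC, WRITE_PRIVATE)


def check_access_levels(levels):
    # Return the levels as a list, raising ValueError for an unknown name.
    levels = list(levels)
    unknown = [level for level in levels if level not in ACCESS_LEVELS]
    if unknown:
        raise ValueError('unknown access levels: %s' % ', '.join(unknown))
    return levels


def login(application_name, levels=(WRITE_PUBLIC,)):
    # Imported here so the constants are usable without launchpadlib installed.
    from launchpadlib.launchpad import Launchpad
    return Launchpad.login_with(
        application_name,
        allow_access_levels=check_access_levels(levels))
</pre>
<p>Catching a misspelt level before the browser opens is the main benefit; the call to
Launchpad itself is unchanged.</p>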
</div>

]]></description>
</item>

<item>
  <title>Re: Getting the hobbyist back</title>
  <link>http://jameswestby.net/weblog/tech/15-Re-Getting-the-hobbyist-back.html</link>
  <description><![CDATA[
<div class="document">
<p>Dear <a class="reference external" href="http://blogs.gnome.org/bolsh/2010/04/29/getting-the-hobbyist-back/">Mr Neary</a>, thanks for your thought-provoking post. I think it is a
problem we need to be aware of as Free Software matures.</p>
<p>Firstly though I would like to say that the apparent ageism present in your
argument isn't helpful to your point. Your comments appear to diminish the
contributions of a whole generation of people. In addition, we shouldn't just
be concerned with attracting young people to contribute, the same changes will
have likely reduced the chances that people of all ages will get involved.</p>
<p>Aside from that though there is much to discuss. You talk about the changes in
Free Software since you got involved, and it mirrors my observations. While these
changes may have forced fewer people to learn all the details of how the system
works, they have certainly allowed more people to use the software, bringing many
different skills to the party with them.</p>
<p>I would contend that often the experience for those looking to do the compilation
that you rate as important has parallels to the experience of just using the software
you present from a few years ago. If we can change that experience as much as we
have the installation and first use experience then we will empower more people to
take part in those activities.</p>
<p>It is instructive then to look at how the changes came about to see if there are
any pointers for us. I think there are two causes of the change that are of interest
to this discussion.</p>
<p>Firstly, one change has been an increased focus on user experience. Designing
and building software that serves the users' needs has made it much more palatable
for people, and reduced the investment that people have to make before using it.
In the same way I think we should focus on developer experience, making it more
pleasant to perform some of the tasks needed to be a hobbyist. Yes, this means
hiding some of the complexity to start with, but that doesn't mean that it can't
be delved in to later. Progressive exposure will help people to learn by not
requiring them to master the art before being able to do anything.</p>
<p>Secondly, there has been a push to make informed decisions on behalf of the user
when providing them with the initial experience. You no longer get a base system
after installation, upon which you are expected to select from the thousands of
packages to build your perfect environment. Neither are you led to download multiple
CDs that contain the entire contents of a distribution, much of which is installed
by default. Instead you are given an environment that is already equipped to do
common tasks, where each task is covered by an application that has been selected
by experts on your behalf.</p>
<p>We should do something similar with developer tools, making opinionated decisions
for the new developer, and allowing them to change things as they learn, similar
to the way in which you are still free to choose from the thousands of packages
in the distribution repositories. Doing this makes documentation easier to write,
allows for knowledge sharing, and reduces the chances of paralysis of choice.</p>
<p>There are obviously difficulties with this given that often the choice of tool
that one person makes on a project dictates or heavily influences the choice
other people have to make. If you choose autotools for your project then I can't
build it with CMake. Our development tools are important to us as they shape
the environment in which we work, so there are strong opinions, but perhaps
consistency could become more of a priority. There are also things we can do
with libraries, format specifications and wrappers to allow choice while still
providing a good experience for the fledgling developer.</p>
<p>Obviously as we are talking about free software the code will always be available,
but that isn't enough in my mind. It needs to be easier to go from code to
something you can install and remove, allowing you to dig deeper once you have
achieved that.</p>
<p>I believe that our effort around things like <a class="reference external" href="https://dev.launchpad.net/BuildBranchToArchive">https://dev.launchpad.net/BuildBranchToArchive</a>
will go some way to helping with this.</p>
</div>

]]></description>
</item>

<item>
  <title>Summer of Code Student Application Deadline</title>
  <link>http://jameswestby.net/weblog/ubuntu/18-summer-of-code-student-application-deadline.html</link>
  <description><![CDATA[
<div class="document">
<p>The deadline for students to submit their applications to Google for Summer
of Code is imminent.</p>
<p>If you were waiting for the last minute to submit, that is now!</p>
<p>If you are a mentor and have the perfect student you have been working with,
check with them that they have submitted the application to Google, otherwise
you will be stuck.</p>
<p>Next week we'll start to process the huge number of applications that we
have for Ubuntu.</p>
</div>

]]></description>
</item>

<item>
  <title>Caution: python-multiprocessing, threads and glib don&apos;t mix</title>
  <link>http://jameswestby.net/weblog/tech/14-caution-python-multiprocessing-and-glib-dont-mix.html</link>
  <description><![CDATA[
<div class="document">
<p>If you don't want to read this article, then just steer clear of
python-multiprocessing, threads and glib in the same application. Let me
explain why.</p>
<p>There's a rather <a class="reference external" href="https://bugs.edge.launchpad.net/ubuntu/+source/gwibber/+bug/554005">famous bug</a> in <a class="reference external" href="https://launchpad.net/gwibber">Gwibber</a> in Ubuntu Lucid, where
a gwibber-service process will start taking 100% of the CPU time
of one of your cores if it can. While looking in to why this bug
happened I learnt a lot about how multiprocessing and GLib work,
and wanted to record some of this so that others may avoid the
bear traps.</p>
<p>Python's <a class="reference external" href="http://docs.python.org/library/multiprocessing.html">multiprocessing module</a> is a nice module to allow you to
easily run some code in a subprocess, to get around the restriction of
the GIL for example. It makes it really easy to run a particular function
in a subprocess, which is a step up from what you had to do before it
existed. However, when using it you should be aware how the way it works
can interact with the rest of your app, because there are some possible
nasties lurking there.</p>
<p><cite>GLib</cite> is a set of building blocks for apps, most notably used by GTK+.
It provides an object system, a mainloop and lots more besides. What we are
most interested in here is the mainloop, signals, and thread integration that
it provides.</p>
<p>Let's start the explanation by looking at how multiprocessing does its thing.
When you start a subprocess using multiprocessing.Process, or something that
uses it, it causes a fork(2), which starts a new process with a copy of the
program's current memory, with some exceptions. This is really nice for
multiprocessing, as you can just run any code from that program in the
subprocess and pass the result back without too much difficulty.</p>
<p>The problems occur because there isn't an exec(3) to accompany the fork(2).
This is what makes multiprocessing so easy to use, but doesn't insert a clean
process boundary between the processes. Most notably for this example, it
means the child inherits the file descriptors of the parent (critically even
those marked FD_CLOEXEC).</p>
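<p>A minimal way to see this inheritance for yourself, assuming a POSIX system and a
Python 3 that offers the 'fork' start method (child and demo are names used only for
this sketch). The pipe is created close-on-exec, yet the child can still write to it
because no exec ever happens:</p>
<pre class="literal-block">
import multiprocessing
import os


def child(write_fd):
    # This descriptor number is only valid here because fork(2) copied the
    # parent's descriptor table, and no exec(3) ran to close it.
    os.write(write_fd, b'hello from the child')
    os.close(write_fd)


def demo():
    read_fd, write_fd = os.pipe()  # close-on-exec by default on Python 3.4+
    ctx = multiprocessing.get_context('fork')  # POSIX only
    proc = ctx.Process(target=child, args=(write_fd,))
    proc.start()
    os.close(write_fd)  # close the parent's copy of the write end
    proc.join()
    data = os.read(read_fd, 1024)
    os.close(read_fd)
    return data
</pre>
<p>Here the shared pipe is deliberate. The trouble starts when the shared descriptor is
one that a library created behind your back.</p>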
<p>The other piece to this puzzle is how the GLib mainloop communicates
between threads. It requires some mechanism where one thread can alert
another that something of interest happened. To do this when you tell
GLib that you will be using threads in your app by calling g_thread_init
(gobject.threads_init() in Python) then it will create a pipe for use by
glib to alert other threads.  It also creates a watcher thread that
polls one end of this pipe so that it can act when a thread wishes to
pass something on to the mainloop.</p>
<p>The final part of the puzzle is what your app does in a subprocess with
multiprocessing. If you purely do something such as number crunching
then you won't have any issues. If however you use some glib functions
that will cause the child to communicate with the mainloop then you
will see problems.</p>
<p>As the child inherits the file descriptors of the parent it will use the
same pipe for communication. Therefore if a function in the child writes
to this pipe then it can put the parent in to a confused state. What
happens in gwibber is that it uses some gnome-keyring functions and that
puts the parent in to a state where the watcher thread created by
g_thread_init busy-polls on the pipe, taking up as much CPU time as it can
get from one core.</p>
<p>In summary, you will see issues if you use python-multiprocessing from
a thread and use some glib functions in the children.</p>
<p>There are some ways to fix this, but no silver bullet:</p>
<blockquote>
<ul class="simple">
<li>Don't use threads, just use multiprocessing. However, you can't
communicate with glib signals between subprocesses, and there's
no equivalent built in to multiprocessing.</li>
<li>Don't use glib functions from the children.</li>
<li>Don't use multiprocessing to run the children; instead exec(3) a script
that does what you want, but this isn't as flexible or as
convenient.</li>
</ul>
</blockquote>
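<p>The last option can be sketched with the standard subprocess module, which does
a fork followed by an exec and so gives the child a clean set of descriptors.
run_isolated is a hypothetical helper name, and passing the code with -c is just for
brevity; a real application would more likely run a script file.</p>
<pre class="literal-block">
import subprocess
import sys


def run_isolated(code):
    # Run Python source in a fresh interpreter and return what it printed.
    # close_fds defaults to True, so only stdin, stdout and stderr are shared.
    result = subprocess.run(
        [sys.executable, '-c', code],
        stdout=subprocess.PIPE, check=True)
    return result.stdout
</pre>
<p>The price is that arguments and results have to be serialised by hand, which is
the flexibility that is lost compared to multiprocessing.</p>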
<p>It may be possible to use the support for different GMainContexts for
different threads to work around this, but:</p>
<blockquote>
<ul class="simple">
<li>You can't access this from Python, and</li>
<li>I'm not sure that every library you use will correctly implement it,
and so you may still get issues.</li>
</ul>
</blockquote>
<p>Note that none of the parties here are doing anything particularly
wrong, it's a bad interaction caused by some decisions that are known to
cause issues with concurrency. I also think there are issues when using
DBus from multiprocessing children, but I haven't thoroughly
investigated that. I'm not entirely sure why the multiprocessing child
seems to have to be run from a non-main thread in the parent to trigger
this, any insight would be welcome. You can find a small script to
reproduce the problem <a class="reference external" href="http://jameswestby.net/scratch/multiprocessing_bug.py">here</a>.</p>
<p>Or, to put it another way, global state bad for concurrency.</p>
</div>

]]></description>
</item>

<item>
  <title>Looking for Summer of Code students</title>
  <link>http://jameswestby.net/weblog/ubuntu/17-looking-for-summer-of-code-students.html</link>
  <description><![CDATA[
<div class="document">
<p>As you've probably heard by now, <a class="reference external" href="https://wiki.ubuntu.com/GoogleSoC2010">Ubuntu has been accepted to Google Summer of Code
this year.</a> We're currently at the point where we are looking for students to
take part and the mentors to pair with them to make the proposal. We have some <a class="reference external" href="https://wiki.ubuntu.com/GoogleSoC2010/Ideas">ideas
on the wiki</a>, but there's nothing to stop you coming up with your own if you have a
great idea. The only requirement is that you find a mentor to help you with it.</p>
<p>The best way to do this is to write up a proposal on your wiki page on the Ubuntu wiki,
and then to email the <cite>Ubuntu Summer of Code mailing list</cite> about it. You can also
ask for possible mentors on IRC and on other Ubuntu mailing lists related to your idea.</p>
<p>I have a couple of ideas on the wiki page, but I am happy to consider ideas from
students that fall in my area of expertise.</p>
<p>I spend most of my time working on developer tools and infrastructure. These are
things that users of Ubuntu won't see, but are used every day by developers of
Ubuntu. Improvements we can make in this area can in turn improve Ubuntu by giving
us happier, more productive, developers. It's also an interesting area to work in,
as there are usually different constraints to developing user software, as developers
have different demands.</p>
<p>If you think that sounds interesting and you have a great idea that falls in to that
area, or you like one of my ideas on the wiki page, then get in touch with me. I will
be happy to discuss your ideas and help you flesh them out in to a possible proposal,
but I won't be able to mentor everyone.</p>
<p>I would consider mentoring any idea that either improved existing tools used by
Ubuntu developers (bzr, pbuilder, devscripts, ubuntu-dev-tools, etc.)
or created a new one that would make things easier. In the same spirit, anything
that makes it easier for someone to get started with Ubuntu development, such as
Harvest, helpers for creating packages, etc. could be a possible project. The last
category would be infrastructure-type projects such as the idea to automate
test-merging-and-building of new upstreams, or similar ideas.</p>
<p>I've also posted about some ideas that I would like to see previously on my blog,
which might be a source of inspiration.</p>
<p>If this interests you then you can find out how to contact me on my <a class="reference external" href="https://launchpad.net/~james-w">Launchpad profile.</a></p>
</div>

]]></description>
</item>

<item>
  <title>Some of my best friends are Unicorns</title>
  <link>http://jameswestby.net/weblog/tech/13-some-of-my-best-friends-are-unicorns.html</link>
  <description><![CDATA[
<div class="document">
<p>As my contribution to Ada Lovelace Day 2010 I would like to mention
<a class="reference external" href="http://emmajane.net">Emma Jane Hogbin</a>.</p>
<p>Emma is an Ubuntu Member, published author, documentation evangelist,
conference organiser, Drupal theming expert, tireless conference presenter,
and many more things as well.</p>
<p>Her enthusiasm is infectious, and her passion for solving problems for people
is admirable. She is a constant source of inspiration to me, and that continues
even as she <a class="reference external" href="http://www.emmajane.net/happy-birthday-agnes-ones-you">branches out in to new things</a>.</p>
<p>(Hat tip for the title to the ever excellent Sharrow)</p>
</div>

]]></description>
</item>

<item>
  <title>The Bazaar Package Importer</title>
  <link>http://jameswestby.net/weblog/ubuntu/16-bazaar-packge-importer.html</link>
  <description><![CDATA[
<div class="document">
<p>The Bazaar package importer is a service that we run to allow people to use
Bazaar for Ubuntu development by importing any source package uploads in to
bzr. It's not something that most Ubuntu developers will interact with directly,
but is of increasing importance.</p>
<p>I've spent a lot of time working in the background on this project, and while
the details have never been secret, and in fact the code has been available
for a while, I'm sure most people don't know what goes on. I wanted to rectify
that, and so started with some <a class="reference external" href="https://wiki.ubuntu.com/DistributedDevelopment/UnderTheHood">wiki documentation</a> on the internals. This post
is more abstract, talking about the architecture.</p>
<p>While it has a common pattern of requirements, and so those familiar with
the architecture of job systems will recognise the solution, the devil is in the
details. I therefore present this as a case-study of one such system that can
be used to contrast with other similar systems as an aid to learning how differing
requirements affect the finished product.</p>
<div class="section" id="the-problem">
<h1>The Problem</h1>
<p>For the <a class="reference external" href="https://wiki.ubuntu.com/DistributedDevelopment">Ubuntu Distributed Development initiative</a> we have a need for a
process that imports packages in to bzr on an ongoing basis as they are
uploaded to Ubuntu. This is so that we can have a smooth transition rather
than a flag day where everyone switches. For those that are familiar with
them think Launchpad's code imports but with Debian/Ubuntu packages as the
source, rather than a foreign VCS.</p>
<p>This process is required to watch for uploads to Debian and Ubuntu and trigger
a run to import that upload to the bzr branches, pushing the result to LP. It
should be fast, though we currently have a publication delay in Ubuntu that
means we are used to latencies of an hour, so it doesn't have to be greased
lightning to get acceptance. It is more important to be reliable, so that the
bzr branches can be assumed to be up to date, that is crucial for acceptance.</p>
<p>It should also keep an audit trail of what it thinks is in the branches. As
we open up write access to the resulting branches to Ubuntu developers we can
not rely on the content of the branches not being tampered with. I don't expect
this will ever be a problem, but I wanted to ensure that we could at least detect
tampering, even if we couldn't know exactly what had happened by keeping private
copies of everything.</p>
</div>
<div class="section" id="the-building-blocks">
<h1>The Building Blocks</h1>
<p>The first building block of the solution is the import script for a single package.
You can run this at any time and it will figure out what is unimported, and do
the import of the rest, so you can trigger it as many times as you like without
worrying that it will cause problems. Therefore the requirement is only to trigger
it at least once when there has been an upload since the last time it was run, which
is a nicer requirement than &quot;exactly once per upload&quot; or similar.</p>
<p>However, as it may import to a number of branches (both lucid and karmic-security
in the case of a security upload, say), and these must be consistent on Launchpad,
only one instance can run at once. There is no way to do atomic operations on sets
of branches on Launchpad, therefore we use locks to ensure that only one process
is running per-package at any one time. I would like to explore ways to remove this
requirement, such as avoiding race conditions by operating on the Launchpad branches
in a consistent manner, as this would give more freedom to scale out.</p>
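<p>One way such a per-package lock could look is sketched below. It assumes a lock file
per package and POSIX flock, which is an illustration rather than a description of what
the importer actually does; package_lock and lock_dir are hypothetical names.</p>
<pre class="literal-block">
import contextlib
import fcntl
import os


@contextlib.contextmanager
def package_lock(lock_dir, package):
    # Hold an exclusive lock for one package for the duration of the block.
    path = os.path.join(lock_dir, package + '.lock')
    with open(path, 'w') as lock_file:
        # LOCK_NB makes a second importer fail at once instead of queueing.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
</pre>
<p>A second attempt to lock the same package raises an OSError while the first is held,
and imports of different packages do not block each other.</p>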
<p>The other part of the system is a driver process. We use separate processes so that
any faults in the import script can be caught in the supervisor process, with the
errors being logged. The driver process picks a package to import and triggers a run
of the script for it. It uses something like the following to do that:</p>
<pre class="literal-block">
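# Record a failure up front, so that a crash of the driver itself is noticed.
write_failure(package, &quot;died&quot;)
try:
    import_package(package)  # &quot;import&quot; is a reserved word in Python
except Exception as e:
    write_failure(package, str(e))  # replace &quot;died&quot; with the real reason
else:
    remove_failure(package)  # only clear the record if the import succeeded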
</pre>
<p>write_failure creates a record that the package failed to import with a reason. This
provides a list of problems to work through, and also means that we can avoid trying
to import a package if we know it has failed. This ensures that previous failures are
dealt with properly without giving them a chance to corrupt things later.</p>
</div>
<div class="section" id="queuing">
<h1>Queuing</h1>
<p>I said that the driver picks a package and imports it. To do this it simply queries
the database for the highest priority job waiting, dispatching the result, or
sleeping if there are no waiting jobs. It can actually dispatch multiple jobs in
parallel as it uses processes to do the work.</p>
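<p>A sketch of that query, using sqlite3 purely for illustration. The table layout and
the function names are assumptions of mine, not the importer's real schema, and it
assumes a single driver process claiming jobs.</p>
<pre class="literal-block">
import sqlite3


def create_queue(conn):
    conn.execute(
        'CREATE TABLE IF NOT EXISTS jobs ('
        ' id INTEGER PRIMARY KEY, package TEXT, priority INTEGER,'
        ' active INTEGER DEFAULT 0)')


def add_job(conn, package, priority):
    conn.execute('INSERT INTO jobs (package, priority) VALUES (?, ?)',
                 (package, priority))


def next_job(conn):
    # Claim and return the highest priority waiting package, or None.
    row = conn.execute(
        'SELECT id, package FROM jobs WHERE active = 0'
        ' ORDER BY priority DESC, id LIMIT 1').fetchone()
    if row is None:
        return None  # nothing waiting: the driver would sleep and poll again
    conn.execute('UPDATE jobs SET active = 1 WHERE id = ?', (row[0],))
    conn.commit()
    return row[1]
</pre>
<p>Jobs of equal priority come out in the order they were added, because the id is used
as the tie-breaker.</p>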
<p>The queue is filled by a couple of other processes triggered by cron. This is useful
as it means that further threads are not required, and there is less code running
in the monitor process, and so less chance that bugs will bring it down.</p>
<p>The first process is one that checks for new uploads since the last check and adds a
job for them, see below for the details. The second is one that looks at the current
list of failures and retries some of them automatically, if the failure looks like it
was likely to be transient, such as a timeout error trying to reach Launchpad. It
only retries after a timeout of a couple of hours has elapsed, and also if that package
hasn't failed in that same way several times in a row (to protect against e.g. the data
that job is sending to LP causing it to crash and so give timeout errors.)</p>
<p>It may be better to use an AMQP broker or a job server such as Gearman for this task,
rather than just using the database. However, we don't really need any of the more
advanced features that these provide, and already have some degree of loose-coupling,
so using fewer moving parts seems sensible.</p>
</div>
<div class="section" id="reacting-to-new-uploads">
<h1>Reacting to new uploads</h1>
<p>I find this to be a rather neat solution, thanks to the Launchpad team. We use
the API for this, notably a method on IArchive called getPublishedSources().
The key here is the parameter &quot;created_since_date&quot;. We keep track of this and
pass it to the API calls to get the uploads since the last time we ran, and
then act on those. Once we have processed them all we update the stored
date and go around again.</p>
<p>This has some nice properties, it is a poll interface, but has some things in
common with an event-based one. Key in my eyes is that we don't have to have
perfect uptime in order to ensure we never miss events.</p>
<p>However, I am not convinced that we will never get a publication that appears
later than one that we have dealt with, but that reports an earlier time.
If this happens we would never see it. The times we use always come from
LP, so don't require synchronised clocks between the machine where this
runs and the LP machines, but it could still happen inside LP.
To avoid this I subtract a delta when I send the request, so assuming
the skew would not be greater than that delta we won't get hit. This does
mean that you repeatedly try and import the same things, but that
is just a mild inefficiency.</p>
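<p>The polling pass, including the delta, might look something like this. It is a sketch:
the ten minute delta is an arbitrary assumption, poll_uploads is a hypothetical name, and
I am assuming each publication exposes a date_created attribute.</p>
<pre class="literal-block">
import datetime

# Assumed safety margin; pick one larger than any skew you expect inside LP.
SKEW_DELTA = datetime.timedelta(minutes=10)


def poll_uploads(archive, last_seen):
    # Return (publications, new_last_seen) for one polling pass.
    since = last_seen - SKEW_DELTA
    publications = list(archive.getPublishedSources(created_since_date=since))
    # Only trust timestamps reported by Launchpad, never the local clock.
    newest = max([p.date_created for p in publications] + [last_seen])
    return publications, newest
</pre>
<p>Because the window overlaps the previous one, the same publication can be returned
more than once, which is safe only because the import script can be run repeatedly.</p>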
</div>
<div class="section" id="synchronisation">
<h1>Synchronisation</h1>
<p>There is a synchronisation point when we push to Launchpad. Before and after
this critical period we can blow away what we are doing with no issues. During
it though we will have an inconsistent state of the world if we did that.
Therefore I used a protocol to ensure that we guard this section.</p>
<p>As we know locking ensures that only one process runs at a time, meaning that
the only way to race is with &quot;yourself.&quot; All the code is written to assume
that things can go down at any time as I said, the supervisor catches this
and marks the failures, and even guards against itself dying. Therefore
when it picks back up and restarts the jobs that it was processing before
dying it needs to ensure that it wasn't in the critical section.</p>
<p>To do this we use a three-phase commit on the audit data to accompany the push.
When we are doing the import we track the additions to the audit data separately
from the committed data. Then if we die before we reach the critical section
we can just drop it again, returning to the initial state.</p>
<p>The next phase marks in the database that the critical section has begun. We then
start the push back. If we die here we know we were in the critical section and can
restart the push. Only once the push has fully completed do we move the new audit
data in to place.</p>
<p>The next step cleans up the local branches, dying here means we can just carry
on with the cleanup. Finally the mark that we are in the critical section is
removed, and we are back to the start state, indicating that the last run was
clean, and any subsequent run can proceed.</p>
<p>All of this means that if the processes go down for any reason, they will clean
up or continue as they restart as normal.</p>
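<p>The recovery logic can be sketched as a small state machine. Here a plain dict stands
in for the database record of one package, and recover, commit and the key names are
all hypothetical; the real code keeps this state in the database.</p>
<pre class="literal-block">
def recover(state, push, cleanup):
    # state is a stand-in for the importer's database record for one package.
    if not state['in_critical_section']:
        # Died before the push began: drop the uncommitted audit data.
        state['pending'] = []
        return
    if state['pending']:
        # Died mid-push: redo the push, then move the audit data in to place.
        push()
        state['committed'] = state['committed'] + state['pending']
        state['pending'] = []
    # Died during cleanup, or just reached it: carry on and clear the mark.
    cleanup()
    state['in_critical_section'] = False


def commit(state, push, cleanup):
    # The normal path is the crash path entered on purpose.
    state['in_critical_section'] = True
    recover(state, push, cleanup)
</pre>
<p>This only works if the push and the cleanup are safe to repeat, as either may be run
again after a crash part way through.</p>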
</div>
<div class="section" id="dealing-with-launchpad-api-issues">
<h1>Dealing with Launchpad API issues</h1>
<p>The biggest area of operational headaches I have tends to come from using the
Launchpad API. Overall the API is great to have, and generally a pleasure to
use, but I find that it isn't as robust as I would like. I have spent quite
some time trying to deal with that, and I would like to share some tips from
my experience. I'm also keen to help diagnose the issues further if any Launchpad
developers would like so that it can be more robust off the bat.</p>
<p>The first tip is: partition the data. Large datasets combined with fluctuating
load may mean that you suddenly hit a timeout error. Some calls allow you to
partition the data that you request. For instance, getPublishedSources that
I spoke about above allows you to specify a distro_series parameter. Doing</p>
<pre class="literal-block">
distro.main_archive.getPublishedSources()
</pre>
<p>is far far more likely to time out than</p>
<pre class="literal-block">
for s in distro.series:
    distro.main_archive.getPublishedSources(distro_series=s)
</pre>
<p>in fact, for Ubuntu, the former is guaranteed to time out; it is a lot of data.</p>
<p>This is more coding, and not the natural way to do it, therefore it would be
great if launchpadlib automatically partitioned and recombined the data.</p>
<p>The second tip is: expect failure. This one should be obvious, but the API doesn't
make it clear, unlike something like python-couchdb. It is a webservice, so you
will sometimes get HTTP exceptions, such as when LP goes offline for a rollout.
I've implemented randomized exponential backoff to help with this, as I tend
to get frequent errors that don't apparently correspond to service issues.
I very frequently see 502 return codes, on both edge and production, which I believe
means that apache can't reach the appservers in time.</p>
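<p>For reference, randomized exponential backoff can be written in a few lines. This is a
generic sketch rather than the importer's code: call_with_backoff and its parameters are
hypothetical, and you would narrow retry_on to whatever HTTP exception your client raises.</p>
<pre class="literal-block">
import random
import time


def call_with_backoff(func, attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, retry_on=(Exception,)):
    # Call func(), retrying failures with randomized exponential backoff.
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: let the supervisor record the failure
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Wait a random time up to the ceiling, which doubles each attempt.
            sleep(random.uniform(0, ceiling))
</pre>
<p>The random element matters when several jobs fail at the same moment, as otherwise
they would all retry in step and hit the service together again.</p>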
</div>
<div class="section" id="summary">
<h1>Summary</h1>
<p>Overall, I think this architecture is good, given the synchronisation requirements
we have for pushing to LP, without those it could be more loosely coupled.</p>
<p>The amount of day-to-day hand-holding required has reduced as I have learnt about
the types of issues that are encountered and changed the code to recognise and act
on them.</p>
</div>
</div>

]]></description>
</item>

<item>
  <title>Dry Rub Barbeque Trout</title>
  <link>http://jameswestby.net/weblog/food/04-dry-rub-barbeque-trout.html</link>
  <description><![CDATA[
<div class="document">
<p>Made this up after buying a nice piece of locally caught freshwater trout.
I think that it would be even better if you were to hot-smoke it.
Apply the rub between two and twelve hours before cooking.</p>
<p>Mix up the following then rub on to the flesh of the fish (enough for
four servings):</p>
<blockquote>
<ul class="simple">
<li>1 tbsp sea/rock salt.</li>
<li>1 tbsp black peppercorns crushed.</li>
<li>1 tbsp ground cumin.</li>
<li>1 tbsp ground coriander.</li>
<li>2 tsp caraway seed.</li>
<li>2 tsp dried tarragon.</li>
<li>2 tsp dried thyme.</li>
<li>2 tsp chilli powder.</li>
<li>Zest of one lemon.</li>
</ul>
</blockquote>
<p>To drizzle on top when cooked melt some butter in a pan, add the
juice of the lemon you used above, a pinch of salt, one crushed clove
of garlic, and a handful of chopped coriander. Simmer for a couple of
minutes.</p>
<p>Enjoy!</p>
</div>

]]></description>
</item>

<item>
  <title>Project Cambria</title>
  <link>http://jameswestby.net/weblog/ubuntu/15-project-cambria.html</link>
  <description><![CDATA[
<div class="document">
<p><a class="reference external" href="http://davidsiegel.org/improving-bug-workflow-for-opportunistic-programmers/">David</a>, it's interesting that you posted about that, as it's something I've been toying with
for the last couple of years. For the last few months I've been (very) slowly experimenting
in my free time with an approach that I think works well, and I think it's time to tell more
people about it and to ask for contributions.</p>
<p>Opportunistic programmers are useful to cater for here, as Debian/Ubuntu development isn't trivial,
and so we are simplifying something existing, which means that it will still be powerful, which
is also important. I'm not only interested in improving the experience for the opportunistic
programmer though, why should they get all the cool stuff? I'm interested in producing something
that I can use for doing Ubuntu development too (though not every last detail).</p>
<p>The project I am talking about has been christened &quot;cambria&quot; and is now <a class="reference external" href="https://launchpad.net/cambria">available on Launchpad</a>.
It's a library that aims to provide great APIs for working with packages throughout the lifecycle,
including things like Bazaar, PPAs, local builds, testing, lintian, etc. It should be pleasurable
to use and also allow you to build tools on top that are also pleasurable. It should also allow
for easy extension in to different GUI toolkits and for command-line tools, though I've only been
working with GTK so far.</p>
<p>In addition, there is a <a class="reference external" href="https://launchpad.net/gedit-ubudev">gedit plugin</a> that allows you to perform common tasks from within gedit.
I chose gedit as it has a pleasant Python API for plugins, isn't so complicated that it takes much
learning, and will already be installed on most Ubuntu desktop systems. As I said though, the library
allows you to implement in anything you like (that can use a python library.)</p>
<p>I've put together some mockups that suggest some of the things that I would like to do:</p>
<img alt="A mockup of an interface for building packages within gedit. There is a button to build the active package, and a box that shows the output of the build." src="/images/build-thumb.png" />
<p><a class="reference external" href="/images/build.png">Build</a></p>
<img alt="A mockup of an interface for jumping to work on packages already downloaded in gedit. There is a list of packages that have previously been worked on, and the user can choose any to open a dialog of the contents of that package to choose a file to edit from within." src="/images/package-list-thumb.png" />
<p><a class="reference external" href="/images/package-list.png">Package list</a></p>
<img alt="A mockup of an interface for downloading the source of packages within gedit. The main point conveyed is that the user should be asked what they intend to work on (bug fix, merge, etc.) so that the tools can do some of the work for them, and wizards and the like can be used to do the rest." src="/images/download-thumb.png" />
<p><a class="reference external" href="/images/download.png">Download</a></p>
<p><a class="reference external" href="http://bazaar.launchpad.net/~cambria-dev/cambria/trunk/annotate/head:/RATIONALE">The RATIONALE file</a> includes some more reasons for the project:</p>
<blockquote>
<p>Project cambria is about wrapping the existing tools for Debian/Ubuntu
development to allow a more task-based workflow. Depending on the task the
developer is doing there may be several things that must be done, but they
must currently work each one out individually. We have documentation to help
with this, but it's much simpler if your tools can take care of it for you.</p>
<p>Project cambria aims to make Ubuntu development easier to get started with.
There are several ways that it will help. Providing a task-based workflow
where you are prompted for the information that is needed to complete the
task, and other things are done automatically, or defaults chosen helps as
it means you can concentrate on completing the task, rather than learning
about all the possible changes you could make and deciding which applies.</p>
<p>Project cambria aims to make Ubuntu development easier for everyone by
automating common tasks, and alleviating some of the tool tax that we pay.
It won't just be a beginner tool, but will provide tools and APIs that
experienced developers can use, or can build upon to build tools that suit
them.</p>
<p>Project cambria will help to take people from novice to experienced
developer by providing documentation that allows you to learn about the
issues related to your current task. This provides an easier way in to the
documentation than a large individual document (but it can still be read
that way if you like).</p>
<p>Project cambria will make Ubuntu development more pleasurable by focusing
on the user experience. It will aim to pull together disparate interfaces
in to a single pleasing one. Where it needs to defer to a different interface
it should provide the user with an explanation of what they will be seeing
to lessen the jarring effect.</p>
</blockquote>
<p>I'm keen for others to contribute; there is some information about this in
<a class="reference external" href="http://bazaar.launchpad.net/~cambria-dev/cambria/trunk/annotate/head:/CONTRIBUTING">the project's CONTRIBUTING file</a>. I'm looking for all sorts of contributions
from all kinds of people and keen to help you get started if you aren't confident
with the type of contribution you would like to make.</p>
<p>There's a mailing list as part of <a class="reference external" href="https://launchpad.net/~cambria">the ~cambria team</a> on Launchpad and an IRC channel
if you are interested in discussing it more.</p>
</div>

]]></description>
</item>

<item>
  <title>Ubuntu Distributed Development Overview</title>
  <link>http://jameswestby.net/weblog/ubuntu/14-distributed-development-overview.html</link>
  <description><![CDATA[
<div class="document">
<p>You may well have heard about it (on this blog especially),
but though I spend lots of my time involved with it and talking
to people about it, there may be some people who aren't
entirely sure what we are doing with the Ubuntu Distributed
Development initiative, or what we are trying to achieve.
To try and help this I wrote up an overview of what we are
doing.</p>
<p>If this project interests you and you would like to help, or
just observe, then you can <a class="reference external" href="https://lists.ubuntu.com/mailman/listinfo/ubuntu-distributed-devel">subscribe to the mailing list</a>.
There are lots of fun projects that you could take on: there's
far more that is possible and would be hugely useful to Ubuntu
developers than we can currently work on. If you want to work
on something then feel free to talk to me about it and we
can see if there is something that would suit you.</p>
<p>Without further ado...</p>
<div class="section" id="the-aim">
<h1>The aim</h1>
<p>The TL;DR version:</p>
<blockquote>
<ol class="arabic simple">
<li>Version Control rocks.</li>
<li>Distributed version control rocks even more.</li>
<li>Bazaar rocks particularly well.</li>
<li>Let's use Bazaar for Ubuntu.</li>
</ol>
</blockquote>
<p>Or, if you prefer a more verbose version...</p>
<p>Ubuntu is a global project with many people contributing to the development
of it in many ways. In particular development/packaging involves many people
working on packages, and much of this requires more than one person to work
on the change that is being made, e.g.</p>
<blockquote>
<ol class="arabic simple">
<li>Working on the problem together</li>
<li>Sponsoring</li>
<li>Other review</li>
</ol>
</blockquote>
<p>etc.</p>
<p>These things usually require the code to be passed backwards and forwards,
and in particular, merged. In addition, we sometimes have to do things
like merge the patch in the bug with a later version of the Ubuntu package.
In fact, Ubuntu is a derivative of Debian, and we expend a huge effort
every cycle merging the two.</p>
<p>Distributed version control systems have to be good at merging; it's a
fundamental property. We currently do without, but we have tools such
as MoM that use version control techniques to help us with some of the
merging. We could carry on in this fashion, or we could move to use
a distributed version control system and make use of its features, and
gain a lot of other things in the process.</p>
<p>Tasks such as viewing history, and annotating to find who made a particular
change and why, also become much easier than when you have to download and
unpack lots of tarballs.</p>
<p>This isn't to say that there aren't costs to the transition, and tools
and processes we currently use that don't currently have an obvious
analogue in the bzr world. That just means we have to identify those
things and put the work in to provide an alternative, or to port, where
it makes sense.</p>
<p>The aim is therefore to help make Ubuntu developers more productive, and
enable us to increase the number of developers, by making use of modern
technologies, in particular Bazaar, though there are several other
things that are also being used to do this.</p>
</div>
<div class="section" id="what-it-isn-t">
<h1>What it isn't</h1>
<p>This isn't a project to overhaul all the Ubuntu development tools. While
there are many things I would like to fix about some of our tools (see
some of the things that Barry had to get his head around in the &quot;First
Impressions&quot; thread), that can go ahead without having to tie it in to
this project. I hope that when we make some common tasks easier, it will
focus attention on others that are still overly complex, and encourage
people to work on those too.</p>
<p>We are not replacing the entire stack. We are building upon the lower
layers, and replacing some of the higher ones. We aim for compatibility
where possible, and not breaking existing workflows until it makes
sense.</p>
</div>
<div class="section" id="the-plan">
<h1>The plan</h1>
<p>You can read the original overall specification for this work at</p>
<blockquote>
<a class="reference external" href="https://wiki.ubuntu.com/DistributedDevelopment/Specification">https://wiki.ubuntu.com/DistributedDevelopment/Specification</a></blockquote>
<p>It is rather dry and lacking in commentary, and also a little out
of date as we drill down in to each of the phases. Therefore I'll
say a little more about the plan here.</p>
<p>The plan is to work from the end of the Ubuntu developers, converting
the things that we work most directly with first. This should give the
biggest impact. We will then work to pull in other things that improve
the system.</p>
<p>This means that we start by making all packages available in bzr, and
make it possible to use bzr to do packaging tasks. In addition to this
we are working with the LP developers to make it possible for Soyuz to
build a source package from the branch, so that you don't have to leave
bzr to make a change to a package. This work is underway.</p>
<p>After that we make all of Debian available in bzr in the same way. This
allows us to merge from Debian directly in bzr. At a first cut, this
just allows us to replace MoM, but in fact allows for more than that.
Have a conflict? You have much more information available as to why
the changes were made, which should help when deciding what to do.</p>
<p>The next step after that is to also bring the Vcs-* branches in to the
history. These are the branches used by the Debian maintainer, and so
allow you to work directly with the Debian maintainer without switching
out of the system that you have learnt.</p>
<p>In a similar way we then want to pull in the upstream branches
themselves. Again, this will allow you to work closely with upstream,
without having to step out of the normal workflow you know.</p>
<p>The last point deserves some more explanation. The idea is that you
will be able to grab a package as you normally do, work on a patch,
and then when you are happy run a command or three that does something
like the following:</p>
<blockquote>
<ul class="simple">
<li>Merges your change in to the tip of upstream, allowing you to
resolve any conflicts.</li>
<li>Provides a cover letter for the change (seeded with the changelog
entry and/or commit message(s)).</li>
<li>Sends the change off to upstream in their preferred format and
location (LP merge proposal, patch in the bugtracker, mailing list
etc.)</li>
</ul>
</blockquote>
<p>As you can imagine, there are a fair number of prerequisites that we
need to complete before we can get to that stage, but I think of that
as the goal. This will smooth some of the difficulties that arise in
packaging from having to deal with a variety of upstreams. Finding the
upstream VCS, working out their preferred form and location for
submission, rebasing your change on their tip etc. I hope this will
make Ubuntu developers more efficient, make forwarding changes
easier to do and do well, and save new contributors from having to
learn too many things at once.</p>
</div>
<div class="section" id="where-we-are-now">
<h1>Where we are now</h1>
<p>We currently have all of Ubuntu imported (give or take); you can</p>
<blockquote>
bzr branch lp:ubuntu/&lt;source package name&gt;</blockquote>
<p>which is great in itself for many people.</p>
<p>We also have all of Debian imported, and similarly available with</p>
<blockquote>
bzr branch lp:debian/&lt;source package name&gt;</blockquote>
<p>which naturally allows</p>
<blockquote>
bzr merge lp:debian/&lt;source package name&gt;</blockquote>
<p>so you can make use of that right now.</p>
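<p>If you would rather script those commands than type them, a minimal sketch
might look like the following. It only assumes that bzr is installed and on your
PATH; the helper names (lp_location, branch_command, merge_command and
branch_and_merge) are made up for this illustration and are not part of bzr or
any plugin.</p>

```python
import subprocess


def lp_location(distro, source_package):
    """Return the lp: location of a package branch, e.g. lp:ubuntu/hello."""
    return "lp:%s/%s" % (distro, source_package)


def branch_command(distro, source_package, target_dir):
    """Argument list for `bzr branch lp:DISTRO/PACKAGE TARGET_DIR`."""
    return ["bzr", "branch", lp_location(distro, source_package), target_dir]


def merge_command(distro, source_package):
    """Argument list for `bzr merge lp:DISTRO/PACKAGE`."""
    return ["bzr", "merge", lp_location(distro, source_package)]


def branch_and_merge(source_package, target_dir):
    """Branch the Ubuntu package, then merge the Debian branch in to it.

    Both calls raise subprocess.CalledProcessError if bzr exits with a
    non-zero status, which is what bzr does when the merge has conflicts.
    """
    subprocess.check_call(branch_command("ubuntu", source_package, target_dir))
    # The merge has to be run from inside the new working tree.
    subprocess.check_call(merge_command("debian", source_package), cwd=target_dir)
```

<p>Calling branch_and_merge("hello", "hello") would then leave you with a
working tree in ./hello with the Debian changes merged but not yet committed,
ready for you to inspect.</p>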
<p>We are also currently looking at the sponsorship process around
bzr branches, and once we have that cracked it will be much easier
for upstream developers who know bzr to submit a bugfix, and that's
a large constituency.</p>
<p>In addition, this means that a new contributor can start without
having to learn debdiff etc., we can pass code around without having
to merge two diffs and the like.</p>
<p>This is great in itself, but we are still some way from the final
goal.</p>
<p>We are currently working on the Vcs-* branches, to make them mergeable,
but there are a number of prerequisites.</p>
<p>In addition the Launchpad team are also working on making it possible
to build from a branch.</p>
</div>
<div class="section" id="where-we-can-go">
<h1>Where we can go</h1>
<p>As I said, building on top of bzr makes a number of things easier.</p>
<p>For instance, once LP can build from branches, we could have a MoM-a-like
that very cheaply tries to merge from Debian every time there is an
upload there, and if it succeeds build the package. This could then
tell you not only if there were any conflicts in the merge, but any
build failures, even before you download the code.</p>
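<p>The decision such a service would make for each upload is small enough to
sketch. This is only an illustration of the idea, not an existing tool:
check_upload, try_merge and try_build are hypothetical names, and the two
callables stand in for whatever would really perform the merge and the
build.</p>

```python
def check_upload(package, try_merge, try_build):
    """Say what happened when merging and then building one package.

    try_merge and try_build are hypothetical callables that take the
    package name and return True on success.
    """
    if not try_merge(package):
        # No point building something that did not merge cleanly.
        return "conflicts"
    if not try_build(package):
        return "build failure"
    return "ok"


def report(packages, try_merge, try_build):
    """Map each package name to its outcome."""
    return dict((p, check_upload(p, try_merge, try_build)) for p in packages)
```

<p>The build is only attempted when the merge succeeds, so the report can
distinguish a package that needs conflicts resolving from one that merges
cleanly but no longer builds.</p>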
<p>In addition, we are currently talking a lot about Daily Builds, building
the latest code every day (or commit, week, whatever). There are a number
of things this brings. It doesn't strictly require version control, but
as it's basically a merging problem having everything in Bazaar makes it
much easier to do. We have a system now built on &quot;recipes&quot; that we are
working to add to LP.</p>
</div>
<div class="section" id="parts-of-the-work">
<h1>Parts of the work</h1>
<p>There are a number of parts to the work, and you will see these and
others being discussed on the list:</p>
<blockquote>
<ul class="simple">
<li>bzr (obviously), which we sometimes need to change to make this work
possible, either bug fixes, or sometimes new features.</li>
<li>bzr-builddeb, which is a bzr plugin that knows how to go from branch
to package and vice-versa.</li>
<li>bzr-builder, the bzr plugin that implements &quot;recipes.&quot;</li>
<li>Launchpad, which hosts the branches, provides the merge proposals, and
will allow building from branches and daily builds.</li>
<li>The bzr importer, this is the process that mirrors the Ubuntu and Debian
archives in to bzr and pushes the branches to LP.</li>
</ul>
</blockquote>
<p>and probably others that I have forgotten right now.</p>
</div>
</div>

]]></description>
</item>

<item>
  <title>Commit access is no more</title>
  <link>http://jameswestby.net/weblog/tech/12-commit-access-is-no-more.html</link>
  <description><![CDATA[
<div class="document">
<p>Many projects that I work on, or follow the development of, and granted there may
be a large selection bias here, are showing some of the same tendencies. Combined
these indicate to me that we need to change the way we look at becoming a trusted
member of the project.</p>
<p>The obvious change here is the move to distributed version control. I'm obviously
a fan of this change, and for many reasons. One of those is the democratisation of
the tools. There is no longer a special set of people that gets to use the best
tools, with everyone else having to make do. Now you get to use the same tools
whether you were the founder of the project, or someone working on your first
change. That's extremely beneficial as it means that we don't partition our efforts
to improve the tools we use. It also means that new contributors have an easier
time getting started, as they get to use better tools. These two influences combine
as well: a long time contributor can describe how they achieve something, and the
new contributor can directly apply it, as they use the same tools.</p>
<p>This change does mean that getting &quot;commit access&quot; isn't about getting the ability
to commit anymore; everyone can commit anytime to their own branch. Some projects,
e.g. Bazaar, don't even hand out &quot;commit access&quot; in the literal sense: the project's
blessed code is handled by a robot, and you just get the ability to have the robot merge
a branch.</p>
<p>While it is true that getting &quot;commit access&quot; was never really about the tools,
it was and is about being trusted to shepherd the shared code, a lot of projects
still don't treat it that way. Once a developer gets &quot;commit access&quot; they just
start committing every half-cooked patch they have to trunk. The full use of
distributed version control, with many branches, just emphasises the shared
code aspect. Anyone is free to create a branch with their half-baked idea and
see if anyone else likes it. The &quot;blessed&quot; branch is just that, one that the
project as a whole decides they will collaborate on.</p>
<p>This leads to my second change, code review. This is something that I also deeply
believe in; it is vital to quality, and a point at which open source software
becomes a massive advantage, so something we should exploit to the full. I see
it used increasingly in many projects, and many moving up jml's <a class="reference external" href="http://mumak.net/stuff/your-code-sucks.html">code review
&quot;ladder&quot;</a> towards pre-merge review of every change. There seems to be increasing
acceptance that code review is valuable, or at least that it is something a good
project does.</p>
<p>Depending on the project the relationship of code review and &quot;commit access&quot; can
vary, but at the least, someone with &quot;commit access&quot; can make their code review
count. Some projects will not allow even those with &quot;commit access&quot; to act
unilaterally, requiring multiple reviews, and some may even relegate the concept,
working off votes from whoever is interested in the change.</p>
<p>At the very least, most projects will have code review when a new contributor
wishes to make a change. This typically means that when you are granted &quot;commit
access&quot; you are able or expected to review other people's code, even though
you may never have done so before. Some projects also require every contribution
to be reviewed, meaning that &quot;commit access&quot; doesn't grant you the ability to
do as you wish, it instead just puts the onus on you to review the code of others
as well as write your own.</p>
<p>As code review becomes more prevalent we need to re-examine what we see as
&quot;commit access,&quot; and how people show that they are ready for it. It may be
that the concept becomes &quot;trusted reviewer&quot; or similar, but at the least
code review will be a large part of it. Therefore I feel that we shouldn't
just be looking at a person's code contributions, but also their code review
contributions. Code review is a skill, some people are very good at it, some
people are very very bad at it. You can improve with practice and teaching,
and you can set bad examples for others if you are not careful. We will
have to make sure that review runs through the blood of a project, everyone
reviews the code of everyone else, and the reviews are reviewed.</p>
<p>The final change that I see as related is that of empowering non-code
contributors. More and more projects are valuing these contributors, and
one important part of doing that is trusting them with responsibility. It
may be that sometimes trusting them means giving them &quot;commit access&quot;,
if they are working on improving the inline help for instance. Yes, it may
be that distributed version control and code review mean that they do
not have to do this, but those arguments could be made for code contributors
too.</p>
<p>This leads me to another, and perhaps the most important, aspect of the
&quot;commit access&quot; idea: trust. The fundamental, though sometimes unspoken,
measure we use to gauge if someone should get &quot;commit access&quot; is whether
we believe them to be trustworthy. Do we trust them to introduce code without
review? Do we trust them to review other people's changes? Do we trust them
to change only those areas they are experts in, or to speak to someone
else if they are not? This is the rule we should be applying when making
this decision, and we should be sure to be aware that this is what we
are doing. There will often be other considerations as well, but this
decision will always factor.</p>
<p>These ideas are not new, and the influences described here did not create
them. However the confluence of them, and the changes that will likely
happen in our projects over the next few years, mean that we must be sure
to confront them. We must discard the &quot;commit access&quot; idea as many projects
have seen it, and come up with new responsibilities that better reflect
the tasks people are doing, the new ways projects operate, and that
reward the interactions that make our projects stronger.</p>
</div>

]]></description>
</item>

<item>
  <title>Kerneloops enabled by default in Karmic</title>
  <link>http://jameswestby.net/weblog/ubuntu/13-kerneloops.html</link>
  <description><![CDATA[
<div class="document">
<p>One of the new things that is going to be in karmic is that
the kerneloops daemon will be installed and running by default.
This tool, created by Arjan van de Ven, watches the kernel logs for
problems. It has a companion service, <a class="reference external" href="http://kerneloops.org/">kerneloops.org</a> which aggregates
reports of these problems, and can sort by kernel version and the like.
This allows kernel developers to spot the most commonly encountered
problems, areas of the code which are prone to bugs etc. When the
kerneloops daemon catches a problem it allows you to send the
problem to kerneloops.org.</p>
<p>We, however, are not using the applet that comes with kerneloops to do
this; we are making use of the brilliant Apport. There are a couple
of reasons for this. We also want to make it easy for you to report
these issues as bugs to Launchpad, and we don't want to prompt you
with two different interfaces to do that.</p>
<p>The changes mean that if your machine has a kernel issue you will
get an apport prompt as usual. As well as asking if you would like
to report the problem to Launchpad like it does for other crashes
it will ask if you would also like to report it to kerneloops.org.
Passing the information through apport means that it can also be used
on servers without running X.</p>
<p>Hopefully you will never see this improvement, but it's now going to
be there for when those bugs do creep in.</p>
</div>

]]></description>
</item>

<item>
  <title>Command support in bzr-builder</title>
  <link>http://jameswestby.net/weblog/ubuntu/12-command-support-in-bzr-builder.html</link>
  <description><![CDATA[
<div class="document">
<p>I've just implemented the most requested feature in bzr-builder
(Hi <a class="reference external" href="http://gould.cx/ted/blog">Ted</a>), command support.</p>
<p>Sometimes you need to run a particular command to prepare a branch
of your project for packaging (e.g. autoreconf). I think this should
generally go in your build target, but not everyone agrees, and
sometimes there is just no other way.</p>
<p>Therefore I added a new instruction to bzr-builder recipes, &quot;run&quot;.
If you put</p>
<!--  -->
<blockquote>
run some command here</blockquote>
<p>in your recipe then it will run &quot;some command here&quot; at that point
when assembling.</p>
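<p>To make &quot;at that point&quot; concrete, here is a toy model of the behaviour. It is
not bzr-builder's code: the assemble function is invented for this sketch, it
ignores every instruction other than &quot;run&quot;, and running the command through the
shell inside the working directory is an assumption rather than a description
of what the plugin does.</p>

```python
import subprocess


def assemble(recipe_text, working_dir, runner=subprocess.check_call):
    """Walk a recipe top to bottom, running each `run` command where it appears.

    Only `run` lines are handled; every other line is skipped in this toy
    version. Returns the commands that were run, in order.
    """
    ran = []
    for line in recipe_text.splitlines():
        # Split "run some command here" in to the instruction and the rest.
        instruction, _, argument = line.strip().partition(" ")
        if instruction == "run" and argument:
            runner(argument, shell=True, cwd=working_dir)
            ran.append(argument)
    return ran
```

<p>Given recipe text containing a line such as &quot;run autoreconf -i&quot;, the sketch
runs that command once it reaches that line, after whatever came before it and
before whatever follows. The runner parameter is only there so that the
function can be exercised without actually executing anything.</p>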
<p>Note that running commands that require arbitrary network access
is still to be discouraged, as you don't know in what environment
someone may assemble the recipe. I'd also advise against using
commands unless you really need them, but that's obviously your
call.</p>
</div>

]]></description>
</item>

<item>
  <title>Distributed Development Video</title>
  <link>http://jameswestby.net/weblog/ubuntu/12-distributed-development-video.html</link>
  <description><![CDATA[
<div class="document">
<p>I recently gave a talk to some fellow Canonical employees about where we are
with the &quot;Distributed Development&quot; project. For that I made a screencast showing
some of the Launchpad Codehosting features that you can now use for Ubuntu. Thanks
to the Launchpad team for making this happen. We're still ironing out the
remaining kinks that make it a pain to use, and getting all the packages imported,
but it's possible to use them now.</p>
<p>The screencast has no audio unfortunately, but you can <a class="reference external" href="http://people.canonical.com/~jamesw/dd.ogv">watch it</a> and try and
guess what I was saying. There's <a class="reference external" href="https://wiki.ubuntu.com/DistributedDevelopment/Documentation">documentation available on the wiki</a> as well.</p>
<p>One of the things the video shows is how to request someone review your change,
i.e. how to get a change sponsored in to Ubuntu. I'm keen to have people test this,
as it's not something I do very often now that I am a core-dev. Therefore if
you want to help test then propose a merge and set the appropriate sponsor
team as the reviewer, and I will prioritise it and you can give me feedback in
return.</p>
<p>Note that a bug in Launchpad means that I won't get a notification when they are
created, so feel free to drop me a line via email or on IRC until that bug is
fixed next month. I'll continue to poll the lists though, so nothing will get
dropped.</p>
</div>

]]></description>
</item>

<item>
  <title>Daily Builds</title>
  <link>http://jameswestby.net/weblog/ubuntu/11-daily-builds.html</link>
  <description><![CDATA[
<div class="document">
<p>As well as seeing use of PPAs for providing bug fixes, new upstream versions, proposed packages, testing etc., we are also seeing them used for providing daily builds of packages. For instance <a class="reference external" href="https://edge.launchpad.net/~fta">Fabien Tassin</a> provides daily builds of lots of Mozilla-related packages and snapshots of Chromium in his various PPAs. Also, there is <a class="reference external" href="http://amarok.kde.org/en/node/482">Project Neon</a>, to provide daily builds of Amarok.</p>
<p>They massively lessen the barrier to using and testing code that is fresh from the fingers of the developers. They avoid you having to build a project from source every day, making sure to keep up with changes in dependencies. They allow you to be testing code almost as it is written, speeding up the feedback cycle to the developers, and potentially increasing the number of people involved in that feedback cycle.</p>
<p>In addition they allow you to verify bugs against the latest code, so that bug reports are of more relevance to the developers. If you so choose they can also be set up so that bugs are also tested with fewer distribution patches, further increasing the developers' confidence in the bug reports.</p>
<p>Mark had an idea for an elegant way to describe how to combine the code to produce the package, and we worked on producing a tool to follow the steps. You can find the result of this in the bzr plugin <a class="reference external" href="https://launchpad.net/bzr-builder">bzr-builder</a>. I've <a class="reference external" href="https://wiki.ubuntu.com/DailyBuilds/BzrBuilder">documented how to use it</a> on the wiki.</p>
<p>There's still more we can do to improve the process, and we have a lot to discuss about what makes a good daily package, and what the limits of them are. If you are interested in discussing this then please join the list of the <a class="reference external" href="https://launchpad.net/~dailydebs-team">dailydebs team in Launchpad</a>.</p>
<p>I'm currently running the <a class="reference external" href="https://launchpad.net/~bzr-nightly-ppa/+archive/ppa">bzr-nightly-ppa</a> using this tool, and have improved some things based on this, but more testing, feedback, and patches are always welcome.</p>
</div>

]]></description>
</item>

</channel>
</rss>
