
View the feed - Martin Fowler's Bliki
Title: Martin Fowler
Feed URL: http://www.martinfowler.com/feed.atom
Description: A cross between a blog and wiki of my partly-formed ideas on software development
Added on: 25-Jul-2006
- Steps towards REST
Last year Leonard Richardson gave a talk at QCon that included a maturity model for RESTful web services. The model is a good way to sneak up on understanding REST principles and the authors of REST in Practice are using it to help their discussion of how to use REST. Here’s my take on explaining the model, which I found helpful in my understanding of what makes REST tick:
Richardson Maturity Model: steps toward the glory of REST
- Bliki: VcsSurvey
When I discussed VersionControlTools I said that it
was an unscientific agglomeration of opinion. As I was doing it I
realized that I could add some spurious but mesmerizing numbers to
my analysis by doing a survey. Google's spreadsheet makes the
mechanics of conducting a survey really simple, so I couldn't
resist.  I conducted the survey from February 23 2010 until March 3 2010
on the ThoughtWorks software development mailing list. I got 99
replies. In the survey I asked everyone to rate a number of version
control tools using the following options:
- Best in Class: Either the best VCS or equal best
- OK: Not the best, but you're OK with it.
- Problematic: You would argue that the team really ought to be using something else
- Dangerous: This tool is really bad and ThoughtWorks should press hard to have it changed
- No opinion: You haven't used it
The results were this:
| Tool | Best | OK | Problematic | Dangerous | No Opinion | Active Responses | Approval % |
|---|---|---|---|---|---|---|---|
| Subversion | 20 | 72 | 6 | 1 | 0 | 99 | 93% |
| git | 65 | 19 | 1 | 0 | 14 | 85 | 99% |
| Mercurial | 33 | 27 | 2 | 0 | 36 | 62 | 97% |
| ClearCase | 0 | 3 | 14 | 41 | 41 | 58 | 5% |
| TFS | 0 | 0 | 32 | 22 | 44 | 54 | 0% |
| CVS | 0 | 14 | 59 | 11 | 15 | 84 | 17% |
| Bazaar | 1 | 13 | 3 | 0 | 80 | 17 | 82% |
| Perforce | 1 | 26 | 16 | 1 | 54 | 44 | 61% |
| VSS | 1 | 1 | 11 | 64 | 22 | 77 | 3% |
As well as the raw summary values, I've added two calculated
columns here to help summarize the results.
- Active Responses: The total of responses excluding "No
Opinion". (eg for git: 65 + 19 + 1 + 0)
- Approval %: The sum of best and ok responses divided by active
responses, expressed as a percentage. (eg for git: (65 + 19) / 85)
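The two calculated columns can be reproduced from the raw counts. The following is a minimal Python sketch, not part of the original survey tooling: the `SURVEY` dictionary and the `summarize` function are hypothetical names, and rounding to a whole percent is an assumption that happens to match the published figures.

```python
# Raw survey counts copied from the results table:
# (best, ok, problematic, dangerous, no_opinion)
SURVEY = {
    "Subversion": (20, 72, 6, 1, 0),
    "git": (65, 19, 1, 0, 14),
    "Mercurial": (33, 27, 2, 0, 36),
    "ClearCase": (0, 3, 14, 41, 41),
    "TFS": (0, 0, 32, 22, 44),
    "CVS": (0, 14, 59, 11, 15),
    "Bazaar": (1, 13, 3, 0, 80),
    "Perforce": (1, 26, 16, 1, 54),
    "VSS": (1, 1, 11, 64, 22),
}


def summarize(best, ok, problematic, dangerous, no_opinion):
    """Return (active responses, approval percentage) for one tool."""
    # Active responses: everything except "No Opinion".
    active = best + ok + problematic + dangerous
    # Approval %: best plus ok, divided by active responses.
    approval = round(100 * (best + ok) / active)
    return active, approval


for tool, counts in SURVEY.items():
    active, approval = summarize(*counts)
    print(f"{tool:<10} active={active:>3} approval={approval}%")
```

Every tool here has at least 17 active responses, so the division is safe; a tool with no active responses would need a guard.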
The graph shows a scatter plot of approval percentage and active
responses. As you can see there's a clear cluster around Subversion,
git, and Mercurial with high approval and a large number of
responses. It's also clear that there's a big divide in approval between those
three, together with Bazaar and Perforce, versus the rest.

Although the graph captures the headline information well, there are
a couple of other subtleties I should mention.
- Although the trio of Subversion, git, and Mercurial cluster close
together on approval, git does get a notably higher amount of best
scores: (65 versus 20 and 33).
- VSS got the most "dangerous" responses, but a couple of people
approved of it.
- Neither TFS nor ClearCase is liked much, but ClearCase got more
"dangerous" responses than TFS (41 versus 22).
- Don't read too much into small differences as I'm sure they aren't
significant. I'm sure the difference in approval percentage between VSS, TFS,
and ClearCase isn't significant, but the difference between these three
and the leaders is.
Some caveats. This is a survey of opinion of ThoughtWorkers who
follow our internal software development discussion list, nothing more. It's
possible some of them may have been biased by my previous article
(although unlikely, since I've never managed to get my ThoughtBot
opinion-control software to work reliably). Opinions of tools are
often colored by processes that are more about the organization than
the tool itself. But despite these, I think it's an interesting data
point. I should also stress the important point to take away from this
isn't the comparison between those close in the numbers, eg comparing
git and Mercurial or comparing TFS and ClearCase. Any survey like this
has a certain amount of noise in it, and I suspect the noise here is
greater than such a difference. The important point is the big
approval gap between the leading tools (Subversion, git, and
Mercurial) and the laggards - essentially the point in
VersionControlTools.
- Bliki: ToyotaFailings
One of the arguments used to support the adoption of lean
techniques in software is the success of Toyota. So do Toyota's recent quality
failings undermine the case for lean software
development?
One answer for this is to take a sense of proportion. Lean
manufacturing techniques were the underpinning of Toyota's rise from
an insignificant company in the 1950's to a global giant in the
2000's. By the 1990's other car companies, and many other
manufacturers, were busily copying Toyota's techniques. The general
sense is that copying these techniques did much to raise the overall
quality of cars in the last decade or so. I would be very surprised if
the recent problems at Toyota are enough to negate that half-century of
success.

But a better answer is to remember that Lean manufacturing is about
manufacturing not software. The application of lean ideas to software
development is a consequence of MetaphoricQuestioning. Lean
ideas can help us come up with better ideas for software development,
and as such are valuable. But in the end their usefulness lies with
how they are used in software and they should be judged on their
record here. Their history in manufacturing, both
good and bad, belongs to another industry.
- Bliki: BlueGreenDeployment
One of the goals that my colleagues and I urge on our clients is
that of a completely automated deployment process. Automating your
deployment helps reduce the frictions and delays that crop up in
between getting the software "done" and getting it to realize its
value. Dave Farley and Jez Humble are finishing up a book on this topic
- Continuous Delivery. It builds upon many of the ideas that are commonly
associated with Continuous Integration, driving more towards this ability to rapidly put
software into production and get it doing something. Their section
on blue-green deployment caught my eye as one of those techniques
that's underused, so I thought I'd give a brief overview of it here.

One of the challenges with automating deployment is the cut-over
itself, taking software from the final stage of testing to live
production. You usually need to do this quickly in order to minimize
downtime. The blue-green deployment approach does this by ensuring
you have two production environments, as identical as possible. At
any time one of them, let's say blue for the example, is live. As you
prepare a new release of your software you do your final stage of
testing in the green environment. Once the software is working in the
green environment, you switch the router so that all incoming
requests go to the green environment - the blue one is now idle.

Blue-green deployment also gives you a rapid way to roll back - if
anything goes wrong you switch the router back to your blue
environment. There's still the issue of dealing with missed
transactions while the green environment was live, but depending on
your design you may be able to feed transactions to both
environments in such a way as to keep the blue environment as a
backup when the green is live. Or you may be able to put the application
in read-only mode before cut-over, run it for a while in read-only
mode, and then switch it to read-write mode. That may be enough to
flush out many outstanding issues.

The two environments need to be different but as identical as
possible. In some situations they can be different pieces of
hardware, or they can be different virtual machines running on the
same (or different) hardware. They can also be a single operating
environment partitioned into separate zones with separate IP
addresses for the two slices.

An advantage of this approach is that it's the same basic
mechanism as you need to get a hot-standby working. Hence this
allows you to test your disaster-recovery procedure on every
release. (I hope that you release more frequently than you have a
disaster.)

The fundamental idea is to have two easily switchable
environments to switch between; there are plenty of ways to vary the
details. One project did the switch by bouncing the web server
rather than working on the router. Another variation would be to use
the same database, making the blue-green switches for web and domain
layers.

This technique has been "out there" for ages, but I don't see it
used as often as it should be. Some foggy combination of Dan North and Jez Humble came up with the
name.
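The cut-over and rollback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Environment`, `Router`, `deploy`, and `rollback` are hypothetical names, the "router" is an in-memory object rather than real network gear, and the smoke test is a stand-in for the final stage of testing.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Environment:
    name: str      # "blue" or "green"
    release: str   # version currently installed


@dataclass
class Router:
    live: Environment   # receives all incoming requests
    idle: Environment   # target for the next release

    def switch(self) -> None:
        """Send all traffic to the other environment."""
        self.live, self.idle = self.idle, self.live


def deploy(router: Router, release: str,
           smoke_test: Callable[[Environment], bool]) -> bool:
    """Install on the idle environment, test there, then cut over."""
    router.idle.release = release
    if not smoke_test(router.idle):
        return False          # live environment is left untouched
    router.switch()
    return True


def rollback(router: Router) -> None:
    """Switch back to the previous environment, which still has the old release."""
    router.switch()


router = Router(live=Environment("blue", "1.0"),
                idle=Environment("green", "1.0"))
deploy(router, "1.1", smoke_test=lambda env: True)
print(router.live.name, router.live.release)
rollback(router)
print(router.live.name, router.live.release)
```

The sketch ignores the harder part the text mentions: transactions that arrive while the new environment is live still have to be reconciled if you roll back.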
- "IT - more than Tools and Technology" track at QCon London
I’ve been a regular speaker at the QCon and JAOO conferences over the last few years. At QCon London this year, I’m involved in a somewhat off-beat track of talks. Software Developers have often had a habit of focusing primarily on the mechanics of developing software well, and not thinking much about the societal value of that software. This is a common strand in many professions, but as the influence of software development grows, I feel it’s increasingly important that we get more engaged in the consequences of the software we build.
This track is a chance to explore some of these issues. As well as myself, there are talks about some work done with UNICEF to make effective use of technology in much poorer parts of the world, the role of IT in reducing carbon footprint, and the contribution of team diversity to innovation.
- Bliki: VersionControlTools
If you spend time talking to software developers about tools, one
of the biggest topics I hear about is version control tools. Once
you've got to the point of using version control tools, and any
competent developer does, then they become a big part of your
life. Version tools are not just important for maintaining a history
of a project, they are also the foundation for a team to
collaborate. So it's no surprise that I hear frequent complaints
about poor version control tools. In our recent ThoughtWorks
technology radar, we called out two items as version control
tools that enterprises should be assessing for use: Subversion and
Distributed Version Control Systems (DVCS). Here I want to expand on
that, summarizing many discussions we've had internally about
version control tools.

But first some pinches of salt. I wrote this piece based on an
unscientific agglomeration of conversations with my colleagues
inside ThoughtWorks and various friends and associates outside. I
haven't engaged in any rigorous testing or structured comparisons,
so like most of my writing this is based on
AnecdotalEvidence. My
personal experience in recent years is mostly subversion and
mercurial, but my usage patterns are not typical of a software
development team. Overwhelmingly my contacts like to work in an
agile-xp approach (even if many sniff at the label) and need tools
that support that work style. I expect many people to be annoyed by
this article. I hope that annoyance will lead to good articles that
I can link to. (After writing this I did do a small
VcsSurvey which didn't undermine my conclusions.)

Fundamentally there are three version control systems that get
broad approval: subversion (svn), git, and mercurial (hg).
Behind the Recommendability Threshold
Many tools fail to pass the recommendability threshold. There are two
reasons why: poor capability or poor visibility. Many tools garner consistent complaints from ThoughtWorkers about
their lack of capability. (ThoughtWorkers being what they are, all
tools, including the preferred set, get some complaints. Those
behind the threshold get mostly complaints.) Two in particular
generate a lot of criticism: ClearCase
(from IBM) and TFS
(from Microsoft). One reason they get a lot of criticism is that
they are very popular on client sites, often with company policies
mandating their use (I'll describe a coping strategy for that
at the end). It's fair to say that often these problems are compounded by
company policies around using VCS. I've heard of some truly bizarre
work-flows imposed on teams that make it a constant hurdle to get
anything done. Since the VCS is the tool that enforces these
work-flows, it does tend to get tarred with that brush. I'm not going to go into details about the problems the
poor-capability tools have here, that would be another
article. (This has probably made me even more unpopular in IBM and
Microsoft as it is.) I will, at least for the moment, leave it with
the fact that developers I respect have worked extensively with, and
do not recommend, these products. The second reason for shuffling a tool behind the
recommendability threshold is that I don't hear many comments about
some tools. This is an issue because less-popular tools make it
difficult to find developers who know how to use them or want to
find out. There are many reasons why otherwise good tools can fall
behind there. I used to hear people say good things about Perforce,
but now the feeling seems to be that it doesn't have compelling
advantages over Subversion, let alone the DVCSs. Speaking of DVCSs,
there are more than just the two I've highlighted here. Bazaar, in
particular, is one I occasionally hear good things about, but again
I hear about it much less often than git or Mercurial.

Before I finish with those behind the threshold, I just want to
say a few things about a particularly awful
tool: Visual Source Safe, or as I call it: Visual Source
Shredder. We see this less often now, thank goodness, but if you are
using it we'd strongly suggest you get off it. Now. Not just is it a
pain to use, I've heard too many tales of repository corruption to
trust it with anything more valuable than foo.txt. So this leaves three tools that my contacts are generally happy
with. I find it interesting that all three are open-source. Choosing
between these tools involves first deciding between a centralized or
distributed VCS model and then, if you chose DVCS, choosing between
git and mercurial.
Distributed or Centralized
Most of the time, the choice between centralized and distributed
rests on how skilled and disciplined the development team is. A
distributed system opens up lots of flexibility in work-flow, but
that flexibility can be dangerous if you don't have the maturity to
use it well. Subversion encourages a simple central repository
model, discouraging large scale branching. In an environment that's
using Continuous
Integration, which is how most of my friends like to work, that
model fits reasonably well. As a result Subversion is a good choice
for most environments. And although DVCSs give you lots of flexibility in how you
arrange your work-flows, most people I know still base their work
patterns on the notion of a shared mainline repository that's used
with Continuous Integration. Although modern VCS have almost magic
tools to merge different people's changes, these merges are still
just merging text. Continuous Integration is still necessary to get
semantic consistency. So as a result even a team using DVCS usually
still has the notion of the central master repository.

Subversion has three main downsides compared to its cooler
distributed cousins.

Because distributed systems always give you a local disk copy of
the whole repository, this means that repository operations are
always fast as they don't involve network calls to central
servers. This is a palpable difference if you are looking at logs,
diffing to old revisions, and anything else that involves the full
repository. If this is noticeable on my home network, it is a huge
issue if your repository is on another continent - as we find with
our distributed projects.

If you travel away from your network connection to the
repository, a distributed system will still allow you to work with
the repository. You can commit checkpoints of your work, browse
history, and compare revisions on an airplane without a network
connection.

The last downside is more of a social issue than a true tool
issue. DVCS encourages quick branching for experimentation. You can
do branches in Subversion, but the fact that they are visible to all
discourages people from opening up a branch for experimental
work. Similarly a DVCS encourages check-pointing of work: committing
incomplete changes, that may not even compile or pass tests, to your local
repository. Again you could do this on a developer branch in
Subversion, but the fact that such branches are in the shared space
makes people less likely to do so.

This last point also leads to the argument against a DVCS, that
it encourages wanton branching, that feels good early on but can
easily lead you to merge doom. In particular the
FeatureBranch approach is a popular one that I don't
encourage. As with similar comments earlier I must point out that
reckless branching isn't something that's particular to one
tool. I've often heard people in ClearCase
environments complain of the same issue. But DVCSs encourage
branching, and that's the major reason why I indicate that a team
needs more skill to use a DVCS well.

There is one particular case where subversion is the better
choice even for a team that is skilled at using a DVCS. This case is
where the artifacts you're collaborating on are binary and cannot be
merged by the VCS - for example word documents or presentation
decks. In this case you need to revert to pessimistic locking with
single-writer checkouts - and that requires a centralized
system.
Git or Mercurial
So if you're going to go the DVCS route - which one should you
choose? Mercurial and git get most of the attention, so I feel the
choice is between them. Then the choice boils down to power versus
usability, with a dash of mind-share and the shadow of github.

Git certainly seems to be liked for its power. Folks go ga-ga
over its near-magical ability to do textual merges automatically
and correctly, even in the face of file renames. I haven't seen any
objective tests comparing merge capabilities, but the subjective
opinion favors git. (Merge-through-rename,
as my colleague Lucas Ward defines it, describes the following scenario. I
rename Foo.cs to Bar.cs, Lucas makes some changes to Foo.cs. When we
merge his changes are correctly applied to Bar.cs. Both git and
Mercurial handle this.)

For many, git's biggest downside was its oft-cryptic commands and
mental model. Ben Butler-Cole phrased it beautifully: "there is this
amazingly powerful thing writhing around in there that will
basically do everything I could possibly ask of it if only I knew
how." To its detractors, git lacks discoverability - the ability to
gradually infer what it does from its apparent design. Git's
advocates say that much of this is because it uses a different
mental model to other VCSs, so you have to do more to unlearn your
knowledge of VCS to appreciate git. Whatever the reason git seems to
be attractive more to those who enjoy learning the internals while
mercurial seems to appeal more to those who just want to do version
control. The shadow of github is important here. Even git-skeptics rate it
as a superb place for project collaboration. Mercurial's equivalent,
bitbucket, just doesn't
inspire the same affection. However there are other sites that may
begin to close the gap, in particular Google Code and Microsoft's Codeplex. (I find Codeplex's use of
Mercurial very encouraging. Microsoft is often, rightly, criticized
for not collaborating well with complementary open source
products. Their use of Mercurial on their open-source hosting site
is a very encouraging sign.) Historically git worked poorly on Windows, poorly enough that
we'd not suggest it. This has now changed, providing you run it using
msysgit and not
cygwin. Our view now is that msysgit is good enough to make
comparison with Mercurial a non-issue for Windows. People generally find that git handles branching better than
Mercurial, particularly for short-lived branches for experimentation
and check-pointing. Mercurial encourages other mechanisms, such as
fast cloning of separate repository directories and queue patching,
but git's branching is a simpler and better model. Mercurial does seem to have an issue with large binary files. My
general suggestion is that such things are usually better managed
with subversion, but if you have too few of them to warrant separate
management, then Mercurial may get hung up by the few that you have.
Multiple VCS
There's often value to using more than one VCS at the same
time. This is generally where there is a wider need to use a less
capable VCS than your team wants to use. The case that we run into frequently is where there is a
corporate standard for a deficient VCS (usually ClearCase) but we
wish to work efficiently. In that case we've had success using a
different VCS for day-to-day team work and committing to the
corporate VCS when necessary. So while the team VCS will see
several commits per person per day, the corporate VCS sees a
commit every week or two. Often that's what the corporate admins
prefer in any case. Historically we've done this using svn as the
local VCS but in the future we're more likely to use a DVCS for
local fronting. This dual usage scenario is also common with git-svn where people
use git locally but commit to a shared svn system. Git-svn is another
reason for preferring git over mercurial. Using a local DVCS is
particularly valuable for remote site working, where network outages
and bandwidth problems can cripple a site that's remote from a
centralized VCS. A lot of teams can benefit from this dual-VCS working style,
particularly if there's a lot of corporate ceremony enforced by
their corporate VCS. Using dual-VCS can often make both the local
development team happier and the corporate controllers happier as
their motivations for VCS are often different.
One Final Note
Remember that although I've jabbered on a lot about tools here,
often it's the practices and workflows that make a bigger
difference. Tools can certainly make it much easier to use a good
set of practices, but in the end it's up to the people to use an
effective way of working for their environment. I like to see
approaches that allow many small changes that are rapidly integrated
using Continuous Integration. I'd rather use a poor tool with CI
than a good tool without.
- Texas Speaking Events Rescheduled
The family medical issue has been resolved happily, so I’m free to go back on the road. We’ve thus rescheduled the events I was supposed to do last month in Texas.
- On February 23rd I’ll be speaking at DFW Scrum in Dallas.
- On February 25th ThoughtWorks is organizing a technology forum in Austin.
As is usual for me, I haven’t planned exactly what I’ll talk about yet, but it’ll revolve around my usual topics of software design and agile methods.
- Bliki: ConversationalStories
Here's a common misconception about agile methods. It centers on
the way user stories are created and flow through the development
activity. The misconception is that the product owner (or business
analysts) creates user stories and then puts them in front of
developers to implement. The notion is that this is a flow from
product owner to development, with the product owner responsible for
determining what needs to be done and the developers
how to do it.  A justification for this approach is that this separates the
responsibilities along the lines of competence. The product owner
knows the business, what the software is for, and thus what needs to
be done. The developers know technology and know how to do things,
so they can figure out how to realize the demands of the product
owner. This notion of product owners coming up with
DecreedStories is a profound misunderstanding of the way
agile development should work. When we were brainstorming names at
Snowbird, I
remember Kent suggesting "conversational". This emphasized the fact
that the heart of our thinking was of an on-going conversation
between customers and developers about how a development project
should proceed.  In terms of coming up with stories, what this means is that they
are always something to be refined through conversation - and that
developers should play an active role in helping that
definition:
- spotting inconsistencies and gaps between the stories
- using technical knowledge to come up with new stories that
seem to fit the product owner's vision
- seeing alternative stories that would be cheaper to build
given the technological landscape
- splitting stories to make them easier to plan or implement
This is the Negotiable principle in Bill Wake's INVEST test for
stories. Any member of an agile team can create stories and suggest
modifications. It may be that just a few members of a team gravitate
to writing most of the stories. That's up to the team's
self-organization as to how they want that to happen. But everyone
should be engaged in coming up with and refining stories. (This
involvement is in addition to the developers' responsibility to
estimate stories.) The product owner does have a special responsibility. In the end
the product owner is the final decider on stories, particularly
their prioritization. This reflects the fact that the product owner
should be the best person to judge that slippery attribute of
business value. But having a final decision maker should never stop
others from participating, and should not lead people astray into a
decreed model of stories.
- Bliki: DslBookRoadmap
Time for another update on my DSL book's progress, since I've not
been writing anything else recently. I had my first round of technical review late in 2009 and have
been incorporating comments into the current drafts. Progress on
this has gone well, in large part because travel is light this time
of the year. I'm also integrating my book production process into
that of Pearson's. The next visible targets are a second round of technical review
and the launching of a roughcut. We're hoping to get these going in
the next couple of months. The roughcut will also allow people other
than official reviewers the chance to throw rocks at the text. After that the material will be gradually readied for
production. We're going to use a much more incremental process than
I've used before, which will be both good and interesting. My sense
at the moment is that we'll see a physical book on bookshelves by the
final quarter of 2010. It's currently looking at around 500 pages
total in a DuplexBook split 150/350.

The material currently on my web-site was
last updated in June. While I've done quite a lot of detailed work
on the book since, the broad structure is pretty similar, so the
website gives a reasonably good picture of the scope of content.
- Apologies for Canceling Texas Speaking Events
I’m afraid I’ve had to cancel my speaking events in Dallas and Austin next week due to a family medical problem. As I write this, it’s not clear how serious the problem is going to be, but there is a good chance that I won’t be able to travel to Texas next week. As a result we felt it was best to cancel the events, while we still have a few days' notice. We do intend to reschedule as soon as the dust settles. My Texas ThoughtWorkers are very keen to have me come out and do these talks, so we want to do them as soon as we reasonably can.
My apologies for this, and I hope you understand. In particular I want to thank the various collaborators in organizing these events for being very understanding under the awkward circumstances.
- Bliki: TechnicalDebtQuadrant
There have been a few posts over the last couple of months about
TechnicalDebt that have raised the question of what kinds of design
flaws should or shouldn't be classified as Technical Debt. A good example of this is Uncle Bob's post saying a
mess is not a debt. His argument is that messy code, produced by
people who are ignorant of good design practices, shouldn't be a
debt. Technical Debt should be reserved for cases when people have
made a considered decision to adopt a design strategy that isn't
sustainable in the longer term, but yields a short term benefit,
such as making a release. The point is that the debt yields value
sooner, but needs to be paid off as soon as possible. To my mind, the question of whether a design flaw is or isn't
debt is the wrong question. Technical Debt is a metaphor, so the
real question is whether or not the debt metaphor is helpful in
thinking about how to deal with design problems, and how to
communicate that thinking. A particular benefit of the debt metaphor
is that it's very handy for communicating to non-technical people. I think that the debt metaphor works well in both cases - the
difference is in the nature of the debt. A mess is a reckless debt which
results in crippling interest payments or a long period of paying
down the principal. We have a few projects where we've taken over a
code base with a high debt and found the metaphor very useful in
discussing with client management how to deal with it. The debt metaphor reminds us about the choices we can make with
design flaws. The prudent debt to reach a release may not be
worth paying down if the interest payments are sufficiently small -
such as if it were in a rarely touched part of the code-base. So the useful distinction isn't between debt or non-debt, but
between prudent and reckless debt. There's another interesting distinction in the example I just
outlined. Not just is there a difference between prudent and
reckless debt, there's also a difference between deliberate and
inadvertent debt. The prudent debt example is deliberate because the
team knows they are taking on a debt, and thus puts some thought as
to whether the payoff for an earlier release is greater than the
costs of paying it off. A team ignorant of design practices is
taking on its reckless debt without even realizing how much hock
it's getting into. Reckless debt may not be inadvertent. A team may know about good
design practices, even be capable of practicing them, but decide to
go "quick and dirty" because they think they can't afford the time
required to write clean code. I agree with Uncle Bob that this is
usually a reckless debt, because people underestimate where the
DesignPayoffLine is. The whole point of good design and
clean code is to make you go faster - if it didn't people like Uncle
Bob, Kent Beck, and Ward Cunningham wouldn't be spending time
talking about it.

Dividing debt into reckless/prudent and deliberate/inadvertent
implies a quadrant, and I've only discussed three cells. So is there
such a thing as prudent-inadvertent debt? Although such a thing
sounds odd, I believe that it is - and it's not just common but
inevitable for teams that are excellent designers. I was chatting with a colleague recently about a project he'd
just rolled off from. The project delivered valuable software,
the client was happy, and the code was clean. But he wasn't happy
with the code. He felt the team had done a good job, but now they
realize what the design ought to have been. I hear this all the time from the best developers. The point is
that while you're programming, you are learning. It's often the case
that it can take a year of programming on a project before you
understand what the best design approach should have been. Perhaps
one should plan projects to spend a year building a system that you
throw away and rebuild, as Fred Brooks suggested, but that's a
tricky plan to sell. Instead what you find is that the moment you
realize what the design should have been, you also realize that you
have an inadvertent debt. This is the kind of debt that Ward talked
about in his
video. The decision of paying the interest versus paying down the
principal still applies, so the metaphor is still helpful for this
case. However a problem with using the debt metaphor for this is
that I can't conceive of a parallel with taking on a
prudent-inadvertent financial debt. As a result I would think it
would be difficult to explain to managers why this debt appeared. My
view is this kind of debt is inevitable and thus should be
expected. Even the best teams will have debt to deal with as a
project goes on - even more reason not to recklessly overload it
with crummy code.
- Bliki: FeatureBranch
With the rise of Distributed Version Control Systems (DVCS) such
as git and Mercurial, I've seen more conversations about strategies
for branching and merging and how they fit in with Continuous
Integration (CI). There's a bit of confusion here, particularly
on the practice of feature branching and how it fits in with CI.
Simple (isolated) Feature Branch
The basic idea of a feature branch is that when you start work on
a feature (or story if you prefer that term) you take a branch of
the repository to work on that feature. In a DVCS, you'll do this
in your personal repository, but the same kind of thing works in a
centralized VCS too. I'm going to illustrate this with a series of diagrams. I have a
shared project mainline, colored blue, and two developers, colored
purple and green (since the developers' names are Reverend Green and
Professor Plum).  I'm using labeled colored boxes (eg P1 and P2) to represent
local commits on the branch. Arrows between branches represent
merges between branches; the merge boxes are colored orange to make them stand
out. In this case there are updates, say a couple of bug-fixes,
applied to the mainline (presumably by Mrs Peacock). When these
happen our developers merge them into their work. To give this a
sense of time, I'll assume we're looking at a few days' work here,
with each developer committing to their local branch roughly once a day. In order to ensure things are working properly, they can run
builds and tests on their branch. Indeed for this article I'll
assume that each commit and merge comes with an automated build and
test on the branch it's on. The advantage of feature branching is that each developer can
work on their own feature and be isolated from changes going on
elsewhere. They can pull in changes from the mainline at their own
pace, ensuring they don't break the flow of their
feature. Furthermore it allows the team to choose its features for
release. If Reverend Green takes too long, we can release with just
Professor Plum's changes. Or we may want to delay Professor Plum's
feature, perhaps because we are uncertain that the feature works the
way we want to release it. In this case we just tell the professor
to not merge his changes into mainline until we are ready for the
feature. This is called cherry-picking: the team decides
which features to merge in before release. Attractive though that picture looks, there can be trouble
ahead.  Although our developers can develop their features in isolation,
at some point their work does have to be integrated. In this case
Professor Plum easily updates the mainline with his own
changes. There's no merge here because he's already incorporated the
mainline changes into his own branch (there will be a build). Things
are however not so simple for Reverend Green: he needs to merge all
of his changes (G1-6) with all of Professor Plum's (P1-5). (At this point many users of DVCSs may feel I'm missing
something as this is a simple, perhaps simplistic view of feature
branching. I'll get to a more involved scheme later.) I've made this a big merge box as it's a scary merge. It may be
just fine, the developers may have been working on completely
separate parts of the code base with no interaction, in which case
the merge will go smoothly. But they may be working on bits that do
interact, in which case here lie dragons. The dragons can come in many forms, and tooling can help slay
some of them. The most obvious dragon is the complexity of
merging the source code and dealing with conflicts as developers
edit the same files. Modern DVCSs actually handle this rather well,
indeed somewhat magically. Git has quite the reputation for dealing
with complicated merges. So much so that the textual issues of
merging are much better than they used to be - indeed I'll go so far
as to discount textual conflicts for the purposes of this
article. The problem I worry more about is a semantic conflict. A simple
example of this is if Professor Plum changes the name of a method
that Reverend Green's code calls. Refactoring tools allow you to
rename a method safely, but only on your code base. So if G1-6
contain new code that calls foo, Professor Plum can't tell in his
code base as he doesn't have it. You only find out on the big merge. A function rename is a relatively obvious case of a semantic
conflict. In practice they can be much more subtle. Tests are the
key to discovering them, but the more code there is to merge the
more likely you'll have conflicts and the harder it is to fix
them. It's the risk of conflicts, particularly semantic conflicts,
that makes big merges scary. This fear of big merges also acts as a deterrent to
refactoring. Keeping code clean is a constant effort; doing it well
requires everyone to keep an eye out for cruft and fix it wherever
they see it. However this kind of refactoring on a feature branch is
awkward because it makes the Big Scary Merge worse. The result we
see is that teams using feature branches shy away from refactoring
which leads to uglier code bases.
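To make the rename example above concrete, here is a minimal Python sketch. All the names in it (`LibraryOnMainline`, `LibraryAfterPlum`, `bar`, `green_feature`, `merge_is_healthy`) are hypothetical; the text only mentions a method called foo. Each branch works on its own, and a textual merge could be clean because the rename and the new call sit on different lines, yet the merged code fails when run. That is why the sketch leans on a test to expose the conflict.

```python
# Hypothetical illustration of a semantic conflict; none of these names
# come from a real code base.


class LibraryOnMainline:
    """The class as both developers saw it when they branched."""

    def foo(self):
        return "result"


class LibraryAfterPlum:
    """Professor Plum's branch: foo has been renamed to bar."""

    def bar(self):
        return "result"


def green_feature(library):
    """Reverend Green's branch: new code that still calls foo."""
    return library.foo().upper()


def merge_is_healthy(library):
    """Stand-in for the automated test run after a merge."""
    try:
        return green_feature(library) == "RESULT"
    except AttributeError:
        # The rename and the new call never touched the same lines,
        # so only running the code exposes the conflict.
        return False


if __name__ == "__main__":
    print(merge_is_healthy(LibraryOnMainline()))  # Green's branch on its own
    print(merge_is_healthy(LibraryAfterPlum()))   # after the big merge
```

The smaller the set of commits being merged, the fewer places there are to look when a check like this fails.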
Continuous Integration
It's these problems that Continuous Integration was designed to
solve. With Continuous Integration my diagram looks like this.  There's a lot more merging going on here, but merging is one of
those things that's much easier to do frequently and small rather
than rarely and large. As a result if Professor Plum is changing
some code that Reverend Green relies on, the Reverend will find it
early, such as when he merges in P1-2. At that point he's only got
to modify G1-2 to work with the changes, rather than G1-6. CI is effective at removing the problem of big merges, but it's
also a vital communication mechanism. In this scenario the potential
conflict will actually appear when Professor Plum merges G1 and
realizes that Reverend Green is actively building on Plum's
libraries. At this point Professor Plum can go and find Reverend
Green and they can discuss how their two features interact. It may
be that Professor Plum's feature requires some changes that don't
mesh well with Reverend Green's changes. By looking at both their
features they can come up with a better design that affects both
their work-streams. With the isolated feature branches our
developers don't discover this till late, probably too late to do
much about it. Communication is one of the key factors in software
development and one of CI's most important features is that it
facilitates human communication. It's important to note that, most of the time, feature branching
like this is a different approach to CI. One of the principles of CI
is that everyone commits to the mainline every day. So unless
feature branches only last less than a day, running a feature branch
is a different animal to CI. I've heard people say they are doing CI
because they are running builds, perhaps using a CI server, on every
branch with every commit. That's continuous building, and a Good
Thing, but there's no integration, so it's not CI.
Promiscuous Integration
Earlier I said parenthetically that there are other ways of doing
feature branching. Say Professor Plum and Reverend Green take tea
together early in the cycle. While chatting they discover they are
working on features that interact. At this point they may choose to
integrate with each other directly, like this.  With this approach they only push to the mainline at the end, as
before. But they merge frequently with each other, so this avoids
the Big Scary Merge. The point here is that the primary issue with
the isolated feature branching scheme is its isolation. When you
isolate the feature branches, there is a risk of a nasty conflict
growing without you realizing it. Then the isolation is an illusion,
and will be shattered painfully sooner or later. So is this more ad-hoc integration a form of CI or a different
animal entirely? I think it is a different animal; again, a key point
of CI is that everyone integrates to the mainline every
day. Integrating across feature branches, which I shall call
promiscuous integration (PI), doesn't involve or even need a
mainline. I think this difference is important.
I see CI as primarily giving birth to
a release candidate at each commit. The job of the CI system and
deployment process is to disprove the production-readiness of a
release candidate. This model relies on the need to have some
mainline that represents the current shared, most up to date
picture of complete.
--Dave Farley
Promiscuous Integration vs Continuous Integration
So if it's different, is PI better than CI, or more
realistically under what circumstances is PI better than CI? With CI, you lose the ability to use the VCS to do cherry
picking. Every developer is touching mainline, so all features grow
in the mainline. With CI, the mainline must always be healthy, so in
theory (and often in practice) you can safely release after any
commit. Having a half built feature or a feature you'd rather not
release yet won't damage the other functionality of the software,
but may require some masking if you don't want it to be visible in
the user-interface. This can be as simple as not including a menu
item in the UI to trigger the feature. PI can provide some middle ground here. It allows Reverend Green
the choice of when to incorporate Professor Plum's changes. If
Professor Plum makes some core API changes in P2, then Reverend
Green can import P1-2 but leave the others until Professor Plum's
feature is put onto the release. One worry with all this picking and choosing is that PI makes it
really hard to keep track of who has what in their branch. In
practice, it seems tooling pretty much solves this problem. DVCSs
keep a clear track of changes and their origins and can figure out
that when Professor Plum pulls G3 he already has G2 but doesn't have
B2. I may have made mistakes drawing the diagram by hand, but tools
do keep track of these things well. On the whole, however, I don't think cherry-picking with the VCS
is a good idea.
Feature Branching is a poor man's
modular architecture, instead of building systems with the ability
to easy swap in and out features at runtime/deploytime they couple
themselves to the source control providing this mechanism through
manual merging.
--Dan Bodart
I much prefer designing the software in such a way that makes it
easy to enable or disable features through configuration changes. My
colleague Paul Hammant calls this Branch by
Abstraction. This requires you to put some thought into what
needs to be modularized and how to control that variation, but we've
found the result to be far less messy than relying on the VCS. The main thing that makes me nervous about PI is the influence on
human communication. With CI the mainline acts as a communication
point. Even if Professor Plum and Reverend Green never talk, they
will discover the nascent conflict - within a day of it
forming. With PI they have to notice they are working on interacting
code. An up-to-date mainline also makes it easy for someone to be
sure they are integrating with everyone, they don't have to poke
around to find out who is doing what - so less chance of some
changes being hidden until a late integration. PI arose out
of open-source work, and the less intensive tempo
of open-source could be a factor here. In a full time job, you work
several hours a day on a project. This makes it easier for features
to be worked on in priority order. With an open source project people often
put in an hour here, and the next hour a few days later. A feature
may take one developer quite a while to complete while other
developers with more time are able to get features into a releasable
state earlier. In this situation cherry picking can be more
important. It's important to realize that the tools you use are largely
independent of the integration strategy you use. Although many
people associate DVCSs with feature branching, they can be used with
CI. All you need to do is mark one branch on one repository as the
mainline. If everyone pulls and pushes to that every day, then you
have a CI mainline. Indeed with a disciplined team, I would usually
prefer to use a DVCS on a CI project than a centralized one. With a
less disciplined team I would worry that a DVCS would nudge people
towards long lived branches, while a centralized VCS and a
reluctance to branch nudges them towards frequent mainline
commits. Paul Hammant may be right: "I wonder though, if a team
should not be adept with trunk-based development before they move to
distributed."
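The idea mentioned earlier, of masking an unfinished feature through configuration rather than cherry-picking with the VCS, can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of any particular project: the `FEATURES` dictionary, the feature names, `MENU_ITEMS`, and `build_menu` are all hypothetical. The unfinished code lives on the mainline, and a configuration entry decides whether its menu item appears in the UI.

```python
# Hypothetical feature configuration; in practice this might be loaded
# from a file or the environment rather than hard-coded.
FEATURES = {
    "plum_reporting": True,   # finished, ready to release
    "green_export": False,    # merged to mainline but not yet visible
}

# Each menu item names the feature that must be enabled for it to show.
# None means the item is always present.
MENU_ITEMS = [
    ("Open", None),
    ("Reports", "plum_reporting"),
    ("Export", "green_export"),
]


def build_menu(features):
    """Return the menu labels whose features are enabled."""
    return [
        label
        for label, feature in MENU_ITEMS
        if feature is None or features.get(feature, False)
    ]


if __name__ == "__main__":
    print(build_menu(FEATURES))
```

Releasing Reverend Green's feature then becomes a one-line configuration change instead of a merge decision.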
- Bliki: DigitalSLR
Like many geeks I'm into photography. We geeks like photography
because it provides the veneer of an artistic endeavor while allowing
us to indulge in lots of technical details and spend money on
expensive toys. A friend recently asked about my camera buying decisions, a
question that prompted me to write them down. I got my first digital
SLR a year ago. Before that I had owned a film SLR for many years, but
started using digital cameras around 2000. I found the convenience of
digital to be compelling and stopped using the film camera. I toyed
with getting a digital SLR in 2004, but instead decided on a high end
fixed lens camera - the Minolta A1. I enjoyed using it, but it conked
out late in 2007. I considered a similar kind of camera, something
like a Canon S5, but decided to bite the SLR bullet. My first decision, and a critical one, was which system to
buy. This is the critical decision as it's difficult (ie expensive) to
reverse. Once you pick your system, you'll then commit money to it by
buying lenses and the cost of switching is more than a dabbler like
me can go with. I felt that the best choice was to go with the big two
- Canon or Nikon. The choice between them was pretty much arbitrary, I
ended up choosing Canon because a friend we occasionally vacation with
has a Canon. A trifling distinction, but really the choice between the
two wasn't a big one. I'm still reasonably happy with it. One misgiving is that the
technological advantage seems to have tipped in Nikon's favor over the
last year, at least according to the blogs I read, but it's a tight
race and Canon could well come back. I've also been recently
intrigued by the new Micro Four Thirds format. Early days (and not
around last year) but the small size and weight are very important to
me. With Canon as the choice, the next step was the initial choice of
body and lenses. My approach was to get pretty much the cheapest body
I could (the Digital Rebel XTI) because I'd rather spend more money on
lenses than on the body. The whole point of SLRs is to have good
lenses, so I'd rather concentrate my limited dollars there. Cameras
also get upgraded much more frequently, so I'm likely to upgrade the
camera in a few years while lenses stay current for much longer.
So which lenses? I forwent the kit lens and got the camera
body-only. As my main lens I went with a mega-zoom, the Sigma
18-200. Serious photographers will, probably rightly, turn their noses
up at this lens. But I'm a dabbler. Most of my photos will only be
seen on my screensaver or on a web page. A few get printed for a wall
of our house, but only on a regular letter size printer. So I doubt
that I'd appreciate the difference of a higher quality
lens. Furthermore I can shoot within its limits. Reviews suggest that
if you stick to f9, the quality stays pretty good. Since I'm
mostly using it outside during the day, that limitation is easy to
live with. As a result I tend to set my camera to aperture priority
with f9, and that covers most of my shots. The advantages of a single mega-zoom are considerable to me. Most
of my photographs are taken while I'm doing something else, often with
others around. I don't go out much to just shoot. In that situation
even changing lenses can be a significant deterrent to getting a
shot. Furthermore size and weight are a big deal when I'm
travelling. While the lens isn't exactly svelte, it's much more compact
than the alternative ways of getting that kind of zoom range. A final
bonus is that it's image stabilized, which allows me to use it for
static interiors. The mega-zoom stays on my camera most of the time, but it wasn't
the only lens I got with my camera. I also picked up the f1.8
50mm. This is an easy lens to get, very cheap, very light, very small
but produces great quality. Since it's the equivalent of an 80mm on
35mm film, it's ideal for portrait photos - particularly with the f1.8
aperture. I use it a lot for shooting people in low light conditions. I toyed with other lenses, but I wanted to get used to those two
before I plonked money on any more. After a few months with the camera I turned my eyes to a
tripod. There are varying views on the net about tripods, some feel
you should only use them if you really have to, some that you should
use them whenever you can. I do like having one around, particularly
for crepuscular shooting. I had a cheap and crummy silk tripod, but
Duncan's blog persuaded me that I should get something better. I
didn't go for his preferred Gitzos (beyond my budget) but I did get
a light Induro tripod, together with a Really Right Stuff head and
fast release clamp. I went for the lightest setup I could get, as I wanted something
that I'd actually be prepared to carry around and my camera/lens
combos aren't particularly heavy. The fast release clamp was important
as I'm someone who likes to move around when shooting and such a clamp makes
a big difference. In hindsight I wish I'd paid the extra for an L
clamp, as I do find it frustrating to futz with the head when
switching orientations.
It was only a month or so more before I went for another lens. A
trip out to Colorado and Utah was the trigger to think about
something wider than the 18-200 would go. I considered the Tokina 11-16
and the Canon 10-22, going for the latter due to it being
lighter. It's a fun lens to use, allowing a few things that
my regular lens doesn't. In particular what's interesting to
work with is the huge depth of field you can get with an ultra wide:
at 10mm you can easily get everything from a foot to infinity. This is probably a reasonable moment to talk about
filters. There's a good bit of discussion on the net about whether
putting on a UV filter is worthwhile. I decided to get one for the 18-200
as it's on my camera so much, but not to get ones for my other lenses
as I use them much less and am prepared to be more careful when those
lenses are on the camera. For the mega-zoom I also picked up a
polarizing filter, which I carry around with me all the time, but
frequently forget to use. The other issue that obsesses camera people is how to carry all
this stuff. All things being equal, I like weight on my waist. So I
went for a waist belt (from Tamrac, due to the
double belt layout) and a Think Tank holster. I like the Think Tank's
ability to extend when I have the hood on my lens. The only problem is
that there are plenty of occasions when a waist belt isn't an
option. The holster comes with a shoulder strap, which is fine, but I
usually want the 10-22 as well. Cindy came to the rescue, sewing some
straps onto the side of the holster so I can attach a lens pouch. To keep track of my photos, and to do some post-processing, I got a
copy of Apple's Aperture. (It seemed a toss up between Aperture and
Lightroom.) I find it works well, better than sticking with iPhoto. The latest lens I added to my collection is the Canon f2 100mm. I got
this for shooting indoors, particularly at conferences for shooting
someone on stage. In those situations I need more reach than the 50mm,
but I still want a really fast aperture at a price and weight that's
rather less than the serious zooms. So far I've only used the 100mm a
couple of times, but have been very happy with it. That burst of buying isn't something I expect to maintain. The
quartet of lenses I have is pretty suited to my needs. There are some
more I'm eyeing. The Canon
100-400mm zoom would be great for wildlife shots, but frankly
we're rarely in the situation where I'd use it, so it's hard to
justify its high cost. A different situation that regularly tickles my
mind is cases where I'm primarily at a conference (so have the 50 and
100mm) but don't want to lug the 18-200 and want to have something
wider. I could take the 10-22, but that leaves a gap and is less light
than I'd like. Ironically this suggests the (now updated) kit
zoom which is cheap and light. The primes less than 50mm are
either too heavy, too expensive, or seem to have less quality than the
kit zoom. If you're curious, here are the results, (because there just aren't
enough holiday snaps on the web.)
- Bliki: SelfInitializingFake
One of the classic cases for using a TestDouble is
when you call a remote service. Remote services are usually slow and
often unreliable, so using a double is a good way to make your tests
faster and more stable. When you're querying a remote service, you need to find a way to
load the expected data into your double. One way to do this is to
use what I'm dubbing a self-initializing fake. The basic plan is
simple. The first time you call the fake it passes the call onto
the actual remote service, and as it returns the data it takes and
saves a copy. Further calls just return the copy. In a sense this is
like a cache, but with the important difference that there is no
attempt to handle cache invalidation, which is handy as that's one
of the TwoHardThings.  I've called this a fake, as that seems the closest fit from the
various varieties of test doubles. The other reasonable alternative
is a stub, but the distinction here is that a stub needs setting up
when you build the fixture, while fakes are autonomous. The interesting thing about a self-initializing fake is how you deal
with situations where the remote service changes its response. One time I saw this approach was with a database controlled by
another application. In this case the data did change,
frequently. This is unhelpful for tests, because automated tests
rely on getting the same answers to the same questions. But usually
tests don't care whether the data is up to date or not, so saving an
old value worked just fine. I ran into this again recently while chatting with my colleague
Josh Price. In his case the remote data was supposedly static, but
occasionally there were changes, which would imply that the system
he was developing needed to change - usually to handle formatting
issues. In this case he had a special test suite that would get all
self-initializing fakes to call the remote service and check that they
returned the same value that was saved. In this case early stages of their build pipeline ran against the
fake, and the last (slowest) stage ran against the service
itself. One interesting problem was that the remote service required
some unimportant parameters which changed from call to call but
didn't alter the results. These were stripped out of the URL when
the fake looked the values up from the store. (Thanks to Josh Price, Darren Cotterill, and Gerard Meszaros for
their help with this piece.)
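As a rough illustration of the mechanics described above, here is a minimal Python sketch of a self-initializing fake. It is one possible shape under stated assumptions, not the implementation used on either project mentioned: the class and method names, the `remote_call` callable, and the `ignored_params` argument are all hypothetical, and the saved copies live in memory, whereas a real test suite would probably persist them between runs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class SelfInitializingFake:
    """Wraps a remote call and replays saved responses after the first hit."""

    def __init__(self, remote_call, ignored_params=()):
        self._remote_call = remote_call      # hypothetical: callable(url) -> data
        self._ignored = set(ignored_params)  # params that don't affect results
        self._store = {}                     # normalized URL -> (url, response)

    def _key(self, url):
        """Strip the ignored query parameters so equivalent calls share a key."""
        parts = urlsplit(url)
        kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in self._ignored]
        return urlunsplit(parts._replace(query=urlencode(kept)))

    def get(self, url):
        key = self._key(url)
        if key not in self._store:
            # First call: pass through to the real service and keep a copy.
            self._store[key] = (url, self._remote_call(url))
        # Later calls: return the copy, with no attempt at invalidation.
        return self._store[key][1]

    def stale_keys(self):
        """Re-query the service and list keys whose saved copy no longer matches."""
        return [
            key
            for key, (url, saved) in self._store.items()
            if self._remote_call(url) != saved
        ]


if __name__ == "__main__":
    calls = []

    def slow_service(url):
        # Stand-in for the real remote service.
        calls.append(url)
        return {"price": 42}

    fake = SelfInitializingFake(slow_service, ignored_params=["session"])
    fake.get("http://example.test/quote?id=7&session=abc")
    fake.get("http://example.test/quote?id=7&session=xyz")
    print(len(calls))  # the second call was served from the saved copy
```

A method like `stale_keys` corresponds to the special test suite that checks the saved values against the live service. It belongs in the last, slowest stage of a build pipeline, since it is the only part that actually calls the remote service after initialization.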