Deep Shah's Blog: continuous integration


Saturday, May 28, 2011

How to build the web project on the CI server without Visual Studio 2010

On one of our ASP.NET projects, we were using the Visual Studio 2010 IDE and MSBuild to compile and build the application. Pretty standard setup for a web project, right? On the developer machines everything worked very smoothly, since each developer had Visual Studio 2010 installed locally.

We set up a continuous integration (CI) server, Jenkins (formerly Hudson), for this project. Naturally, we didn't want to install Visual Studio 2010 on the CI server, but we still wanted to build our app there. Hence, we installed Microsoft Windows SDK 7.1 and all its components on the CI server. After installing the SDK, we were able to build all projects on the CI server except the web project.

The error that we were getting was

Obviously, since Visual Studio 2010 was not installed on the CI server, the path mentioned in the error didn't exist, and that's why the web project could not be built.

I opened the csproj file in Notepad++ to find the reference to "Microsoft.WebApplication.targets". I found one in the csproj, which looked like

Indeed, the web csproj file had an import statement that added the reference to "Microsoft.WebApplication.targets". But I was not sure why this reference was needed in the first place.

My first attempt at fixing the CI server:

I removed the reference to "Microsoft.WebApplication.targets" from the web csproj file and tried building the application using MSBuild. To my surprise, the app built fine and ran all the tests without any issue. My confidence grew a little; I thought I had the solution. I assumed the reference had been added to the csproj by Visual Studio 2010 but was never used.

I committed the fix, and the build went green on the CI server as well!

After a few hours, some developers complained that when they opened the web project, Visual Studio 2010 prompted them to convert it into a Visual Studio 2010 project. Clearly, this behaviour was triggered by my removing the import statement from the web csproj file.

It's pretty clear now why the import was required. Visual Studio 2010 uses the import of "Microsoft.WebApplication.targets" to identify the project as a Visual Studio 2010 web project. Hence, although the build runs fine using MSBuild on developer machines as well as on the CI server, this is not the right fix!

My second attempt at fixing the CI server:

After Googling a little, I found only two options to fix this particular issue:

Install Visual Studio 2010 on the CI server
Copy the "Microsoft.WebApplication.targets" file manually from any developer machine to the CI server

It was sad to see that there was no standalone installer that installed the "Microsoft.WebApplication.targets" at the desired location. We had to either install the full Visual Studio 2010 or copy the file manually.

I decided to choose the lesser evil of the two. We copied the file "Microsoft.WebApplication.targets" from my machine (located at C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v10.0\WebApplications on a 64-bit Windows 7 installation) onto the CI server (at C:\Program Files\MSBuild\Microsoft\VisualStudio\v10.0\WebApplications; the CI server runs Windows 2003) and then ran the build. Yep! It worked! The build went green on the CI server.

I know it's just a hack, but for now I had no other option!
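The manual copy described above could also be scripted so that it is repeatable when the CI server is rebuilt. What follows is only a minimal sketch in Python, not what we actually ran; the function name copy_web_targets is made up for illustration, and the two directories are passed in by the caller rather than hard-coded.

```python
import shutil
from pathlib import Path

# File name taken from the post; everything else here is illustrative.
TARGETS_FILE = "Microsoft.WebApplication.targets"


def copy_web_targets(source_dir, dest_dir):
    """Copy the targets file from source_dir into dest_dir.

    dest_dir is created if it does not exist yet, which is the case on a
    machine that never had Visual Studio 2010 installed.
    Returns the path of the copied file.
    """
    source = Path(source_dir) / TARGETS_FILE
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, dest / TARGETS_FILE))


# Example call using the two paths mentioned in the post (the source would
# have to be reachable from the CI server, e.g. via a share or a checkout):
# copy_web_targets(
#     r"C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v10.0\WebApplications",
#     r"C:\Program Files\MSBuild\Microsoft\VisualStudio\v10.0\WebApplications",
# )
```

It is still the same hack, just written down so nobody has to remember which file went where.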

Sunday, November 21, 2010

GIT - Version control done the right way - Part - 2

In the previous post, I narrated how life was before GIT came into the picture. This post continues from there: I will narrate how moving to GIT over Subversion (SVN) helped us fix the issues we were facing with pure SVN.

So, a short recap.

Problems with SVN:

It's not easy to work on multiple streams of work. SVN is not built for that.
No offline commits. No private commits (commits that only I can see and make them public when I want). There is no way in which I can commit and not make those changes publicly available.

How did we get around those problems using GIT over SVN:

There was no way that our team could have moved to pure GIT. Other teams working on the same projects were using the SVN repository for their commits, so we had to work with the SVN repository for sure. Hence, we decided to use GIT over SVN. SVN would still be our central repository: all updates were taken from, and all commits were made to, the same central SVN repository.

GIT has excellent support for SVN repositories. We used the git svn command. Here's how it helped us.

We started off with a small team moving towards GIT: out of 120 developers, 10 started using it. In the first week, boy, we had a tough time convincing people to give GIT a fair chance.

The process of unlearning SVN is a little difficult. One has to be really diligent and patient at least for a week. After that, trust me on this, you will never want to work with anything else.

Moving on, the situation of the continuous integration server was still the same. Narrow commit windows and long build times.

Let's say Deep and Jamie are in the middle of developing Feature - 1. They have been committing to their local GIT repository as often as they want. At every logical point (the point at which all their tests are passing and the code is in good shape) they do a local commit. This commit is only visible to them and not to anyone else.

Let's say the build goes green. They don't need to hastily push their changes to SVN any more. If Feature - 1 is not complete, they simply keep their changes local. Even without pushing to SVN, they are getting all the benefits of a version control system.

For example, say they made 7 local commits to get the feature done. This means they have 7 commits in their local GIT repository, but SVN does not know anything about them. They have got the benefit of version control even without making their changes public.

When they decide the feature is ready to go public, they have the following options (based on the state of the build):

Push their changes to SVN, if build is green
Not push their changes to SVN, if build is red

If they get a chance to push to SVN, they are pretty confident that their changes are in a good state and build will go green with their changes.

If they do not get a chance to commit to SVN, they can continue development of Feature - 2 on a different local branch!

Eventually, when the build goes green after n hours (where n > 1), they can quickly switch to the Feature - 1 branch and commit only the Feature - 1 changes. In this situation they are confident that only the Feature - 1 changes will be committed! That is exactly what they wanted!

After pushing the Feature - 1 changes to SVN, they can switch back to the Feature - 2 local branch and continue development from where they left off! Awesome, isn't it!
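The branch juggling described above maps onto a handful of git and git svn commands. Here is a rough sketch in Python that only builds the command list and does not run anything. The branch names feature-1 and feature-2 and the function names are made up for illustration, and it assumes a local master branch that tracks the SVN trunk.

```python
import shlex


def feature_push_plan(done_branch, next_branch, trunk_branch="master"):
    """Return the git commands, as argument lists, for parking a finished
    feature while the build is red and publishing only that feature later.

    Assumes all work on both branches is committed locally before switching.
    """
    return [
        # Build is red: leave the finished feature alone and start the next
        # one on its own local branch, cut from the SVN-tracking branch.
        ["git", "checkout", "-b", next_branch, trunk_branch],
        # Build is green again: switch back to the finished feature.
        ["git", "checkout", done_branch],
        # Replay the local commits on top of the latest SVN revisions.
        ["git", "svn", "rebase"],
        # Send only this branch's commits to the central SVN repository.
        ["git", "svn", "dcommit"],
        # Resume the next feature where it was left.
        ["git", "checkout", next_branch],
    ]


def format_plan(plan):
    """Render each command as a copy-pasteable shell line."""
    return [shlex.join(command) for command in plan]


# Example:
# for line in format_plan(feature_push_plan("feature-1", "feature-2")):
#     print(line)
```

The point of the sketch is the ordering: nothing from the second branch is ever in the working tree when git svn dcommit runs, so there is no guessing about which files belong to Feature - 1.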

Working with multiple streams of work is pretty natural with GIT.

Over and above everything else, they could commit at every logical point. This is really important!

The advantage of doing this is, one could go back in time and have a look at how the file A.java looked at logical point - 2.

When they were using only SVN, this was never possible. They could only see the entire Feature - 1 change set as one big fat commit. They could never go back in time and see how the file A.java had evolved while Feature - 1 was being developed.
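Going back to a logical point comes down to git show with a commit and a path. The following is a small sketch, wrapped in Python; the function names are made up for illustration, and the path src/A.java and the commit reference in the example are hypothetical.

```python
import subprocess


def show_file_command(commit, path):
    """Build the `git show <commit>:<path>` command.

    path is relative to the root of the repository.
    """
    return ["git", "show", f"{commit}:{path}"]


def file_at_commit(commit, path, repo_dir="."):
    """Run the command inside repo_dir and return the file's contents
    exactly as they were at the given commit."""
    result = subprocess.run(
        show_file_command(commit, path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git reports an error, e.g. unknown commit
    )
    return result.stdout


# Example: if the 7 local commits are the latest ones on the branch, the
# second of them is HEAD~5, so A.java at logical point - 2 would be:
# print(file_at_commit("HEAD~5", "src/A.java"))
```

Running git log on the same path lists the commits that touched the file, which is how one would find the right commit in the first place.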

So far we have seen only two main advantages of using GIT even with SVN. This is just the tip of the iceberg. GIT is pretty awesome!

But the flip side is that you need to be patient until you unlearn SVN and feel the power of GIT.

Give GIT a fair chance and you will do things you thought were impossible thus far! Do not wait any more, go GIT it!

Monday, November 15, 2010

GIT - Version control done the right way - Part - 1

A lot of people have asked me why the name of my blog is www.gitshah.com. The fact is, I am a big advocate of GIT.

I am not going to bore you with what GIT can do and what it can't (well, I am yet to find out what it can't do). In this series of two posts, I am going to share with you guys a real-life incident that changed the lives of a few developers and changed their perspective on version control forever.

I was working in a techie company, in a team of around 120 smart developers (yes all of them working on one project. And no, we were not building the next generation rocket that will travel faster than the speed of light).

The version control system we were using was Subversion (SVN). We had a continuous integration server set up, which means that as soon as someone checks in any code, a build is triggered and all test cases are run. If the build breaks, people know something is wrong and someone needs to fix it.

Looks like a standard project? Well, you will see. Continuous integration was the most important aspect of this project. Imagine 120 smart developers churning out code every minute for at least 7 hours a day (man, that is a lot of code, I agree). In this situation one has to be very careful about what goes into the central SVN repository, because we need the build to remain green as much as possible.

To do that we had a simple rule

NEVER COMMIT ON A BROKEN BUILD, UNLESS YOUR COMMIT WILL FIX THE BUILD.

Well, it sounds fair: no commits on a broken (or red) build. If you are fixing the build, then of course you are allowed to commit. Otherwise the build would remain broken for eternity!

All this is nice and rosy, but there is one problem. Because the project was so big, a full build took around 1 hour (sometimes even more). Wow! One hour (what were we building? I think Linux kernel builds must be faster than this) before someone could find out whether his/her changes were good or not.

What's wrong with that?

Well, consider this scenario. Deep and Jamie have been working on developing Feature - 1. When working with SVN, they are always in a dilemma: to commit or not to commit.

Why?

Because when you check new code in, everybody else gets it. As soon as someone commits to SVN, it's made public.

Hence, they have two choices:

Check in half-baked, buggy code and drive everyone else crazy
Avoid checking it in until the feature is completely developed

Deep and Jamie prefer option 2. They decide not to commit any code until Feature - 1 is developed and the code is in a stable state. Typically, feature development may take 2-4 days. No commits for 2-4 days! No version history for 2-4 days!

From the start of feature development till the end, developers cross many milestones or logical points. At these points, code is relatively stable and does a specific task well. All tests are passing, but the feature is not yet complete. All those logical points are really crucial. But with SVN these logical points are lost in time.

Since they cannot commit to SVN, till the feature is working, they have no choice but to continue coding. After 3 days they are done with the development. Now, they have a big change set (code changes worth 3 days) that they want to commit to SVN. There are two possibilities now:

They can commit to SVN, because the build is green (not broken)
They cannot commit to SVN since the build is red (broken).

If they can commit at this point then, everything is well and good.

But, if they cannot commit at this point then, again they have two choices

They can wait for the build to go green (and in the mean time can play some game on XBOX or have some noodles)
Start development of Feature - 2 and commit the Feature - 1 change set when the build goes green

Deep and Jamie cannot wait for the build to go green. The build takes an hour to go green, and they can't lose that many billable hours doing nothing. You see, they are responsible and committed developers. They move on and start the development of Feature - 2.

Let's say that after another hour the build goes green.

Mission - commit Feature - 1 begins!

Remember, they were in a good state an hour ago. All tests were passing and Feature - 1 was complete. But since then, they have done more coding (remember, they had started development of Feature - 2). They want to commit the changes of Feature - 1 but not those of Feature - 2.

What do they do, what do they do? Time is running out. They must commit now. They can't wait any longer to commit Feature - 1.

They decide to selectively check in the Feature - 1 files. They take a call on whether to check in a specific file based on their programmer instinct. They hope and pray that the files they have checked in form a logical change set for Feature - 1. A build is triggered. While their changes are being built, Asif and Sandeep check in their changes. Remember, the build takes an hour, so everyone is back to churning out more code.

Deep and Jamie's prayers were not answered! The build breaks.

The build is broken! The fun starts now! Count the number of hours before the build is green again.

Since Asif and Sandeep had already checked in (they had checked in when the build was green and it was building some changes), one more hour is lost. Naturally, since the build was broken before their changes were integrated, this build is on a death march. The build has been broken for an hour now.

Other developers have already started cursing Deep and Jamie for breaking the build. But wait, Deep and Jamie are thinking their changes can never break the build, their code is flawless! They had tested Feature - 1 so many times. Finally they realize that the files they had checked in did not form the complete change set required for Feature - 1 (remember, they had selectively checked in the files). They forgot to check in a file that was actually required. Damn!

It has already been an hour since they last checked in, and they have done some more coding. To fix the build they have to go back to the original check-in state, check in the required file and hope that the build goes green. It all looks very complex to me!

This is a risky situation: the developer is not sure which files to check in.

Admittedly, they could have taken a patch of Feature - 1 before starting development of Feature - 2, then reverted all changes before committing, applied the Feature - 1 patch and committed to SVN. Anyone who has done this even once would agree that this never works smoothly.

The fact is, working on multiple streams of work is not natural in SVN.

After their check-in, the build needs to run again. It runs for another hour, after which it goes green. Out of an 8-hour working day, the build was broken for 2 hours!

Extrapolate this situation to 120 developers. As the number of developers increases, the problem gets bigger and bigger. Everyone wants to check in whatever code they have written. When the build goes green, maybe they are not in a state to commit; they might have gone past the logical point. End result: no matter how strong your instincts are, you can never be sure that your check-in has all the changes you intended to commit.

Why does this happen?

With so many developers and such long build cycles, there is a narrow window to check in. There were times when we could not check in for one week! It's an unbelievable situation, madness actually: a week's work that is not checked in, a week's code without any versioning. Since we cannot commit to SVN without making our changes public, it's impossible to maintain the history of a file at every logical point.

No offline or local commits is the single most important problem with SVN.

What are the alternatives?

Some teams started branching out from trunk. The plan was to do the feature development in this new SVN branch and then merge it with trunk at the end of feature development. They solved one problem: they could commit as often as they wanted without impacting other developers outside their team. But this gave birth to a bigger problem, the problem of merging. They still had to merge their code (lying in the SVN branch) with the trunk.

Other teams were still working out of trunk. Hence, their SVN branch was getting outdated every minute. At the end of a week, when they tried to merge the code into trunk, they experienced what I call hell on earth.

When two SVN branches diverge and we try to merge them, Subversion tries to figure out what has changed and fails. End result: SVN shows a lot of merge conflicts. These are not really conflicts but places where SVN failed to figure out what had changed.

I have seen people doing the merges for two straight days till 0200 hrs, after which they think, they are in a good state to check-in, only to realize that they have broken the build.

Well, you might find it unreal and exaggerated, but this has happened, and it's a sad situation to be in.

Enough of SVN bitching. In the next post I will show how GIT helped us get around all these problems.