Observations from a Large Scale Agile Project

Well, I’ve been living and working in Bristol 3-4 days a week since last October (2013), working on a pretty interesting and pretty intense project.

I’m the technical lead on a large scale Agile transformation project that will be used in 22,000 locations across the UK by 60,000 people, with 40 million processes completed per year. It’ll be developed using open source technologies and will align with the Government Digital Service’s assessment criteria.

There are 3 multi-disciplinary scrum teams with around 12 people in each team; and that’s just for the software development side of things. In total, there are around 150-200 people on the programme, including a number of additional software and infrastructure suppliers who are looking after software assurance and devops.

Some key observations:

  • Communication is key – with this many people, it’s important that information is shared quickly and accurately. Scrum of Scrums is a decent forum, but I’d also recommend using the multi-disciplinary stand-ups too, i.e. tech leads, UX, BAs etc. Share information here across the disciplines, and have it relayed back to the teams and back to ‘management’. People who feel engaged in the process are happier, and it has the bonus effect of both killing rumours and moving progress in the same direction across all teams.
  • Governance of Shared Services – On this programme, there are 3 other parallel projects. Each of these projects has its own requirements for Authentication / Authorisation, Integration, Document Generation etc. It’s very easy to think ‘let’s do it once and share it across projects’, but suddenly you are confronted with the functional and non-functional requirements of all these other projects. Don’t underestimate it; the governance alone is a full-time job and it needs customer ownership.
  • Developers argue – Everyone can and often will have a different opinion, and sometimes it can get heated. The important point is to listen to all sides and share the concerns. If you don’t lance that boil, it can go septic. This is especially important when you have a multi-supplier delivery team. Have a session across teams to look at best coding practices and share them. Do this frequently.
  • Open Source has its problems – What version of MySQL should you use? Oracle’s Classic or Enterprise edition? What about the DB engine – Percona? TokuDB? How is it all covered by licensing? What about JasperReports – can the community edition be automated with Puppet? Start considering these early.
  • Build environments early – Your environment code (i.e. Puppet configs) should be centralised and managed under source control. Every single environment should be built from this repository programmatically rather than manually. If your scripts drift, then your environments drift. No-one wants a Dev and Build server that are different. Your environments should evolve in the same way as your application – this means destroy and rebuild per sprint, which ensures that you are thoroughly testing everything (see the sketch after this list). Make sure your infrastructure people/suppliers are doing this.
  • Performance Test early – Performance testing needs to be in-sprint. You need that feedback coming back to you so it can be prioritised into future work. After all, it’s part of the Agile Testing Quadrants.
  • Give People Opportunities – Get a second-in-command. This ensures that you aren’t a single point of failure, and it helps develop them as well. Roll this approach out through the team. Additionally, talk directly with the people on your project to find out what they want to do and be, then see what you can do to assist them on that journey. It’s important to make the time to talk to people directly, otherwise you can lose touch with what’s actually going on at the coal face.
  • Empower and trust your people – Similar to the point above; encourage and develop your team’s ability to think for themselves. Better to have a team of intelligent, autonomous individuals who can own decisions than drones who think you have all the answers. However, this approach needs to be tempered with some governance, to ensure that each team isn’t re-inventing the wheel – communication is key.
  • Review and improve often – It’s important to have retrospectives at each level, be it scrum team, tech leads, UX etc. Look for the rough edges in your processes and see what can be done to streamline things. For example, we had a failing build because all the teams were committing at the same time – a simple thing like having the tech leads own the commit sorted it. Green build.
  • Don’t assume it’ll get picked up – Self-organising teams are fantastic, but when you’re passing information over to other suppliers, it’s best to ensure that there’s an owner who can pull things together, at least for the initial few iterations. Pull together a quick and basic checklist to ensure things don’t get missed when handing off software to another supplier.
  • Agile vs Waterfall – It’ll happen. Despite the best intentions of Agile communication, you will inevitably hit a part of your project or programme that is waterfall based. For example, your software development might be a lean, mean, agile sprinting machine, but you can’t really roll out to 22,000 locations each sprint. It’s important to align your sprints with the wider programme deadlines. Beware of agile frameworks as well – I’m not convinced on that front.
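As an aside on the environments point above, here’s a rough sketch of what ‘destroy and rebuild per sprint’ can look like in practice. It’s illustrative only – it assumes a Vagrant/Puppet setup like ours, and the repo path is made up:

```python
#!/usr/bin/env python3
"""Tear an environment down and rebuild it from source-controlled config.
Illustrative sketch only - assumes Vagrant + Puppet; names are made up."""
import subprocess

def run(cmd, cwd="infrastructure"):           # 'infrastructure' = env repo checkout
    subprocess.run(cmd, cwd=cwd, check=True)  # fail loudly if any step breaks

run(["git", "pull"])                          # latest Puppet configs from the repo
run(["vagrant", "destroy", "-f"])             # throw the old environment away...
run(["vagrant", "up", "--provision"])         # ...and rebuild it from scratch
```

If a script like that can’t produce a working environment on its own, your environment code isn’t really under control.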

Davey

New Year, New Role, New Tech

So, 2014 is here, and it’s been here for nearly 3 months. I’ve been quiet, but that’s because I’m now working on a new project based in Bristol. Strangely, it’s a million miles from the tech that I normally work with, i.e. principally .NET.

This new project is all open source. We’re talking PHP, MySQL, Apache, Linux, Java, Vagrant, Puppet, shell scripts, MongoDB, Memcached, SAML and potentially cool things like YubiKeys. Additionally, it’s Agile based and I’ve a big whiteboard with all my tech stories on it. Class stuff.

I can’t go into details on the project, except to say that it’s a big one for me and I’m absolutely loving it. This blog will change over the next few weeks or so as I adopt a new colour scheme and theme.

Anyway, it’s good to be back on here. Took me a while, as I’ve moved to using a Mac and hadn’t a clue what tools to use to get access again :-)

Long time updating!

Hey all,

So my blog has been neglected for a few months now (5! – wow, sorry). I do have a good excuse though. Well, a couple actually.

Excuse #1!

So during the summer, I organised a coding camp for teenagers. It’s called Kainos CodeCamp and it was all about teaching young people what a career in IT is like, and also bridging the gap between ICT (spreadsheets, Word, PowerPoint, depression) and real software development. I even ended up talking about it on BBC News :-)

Go check out the site for yourself, or watch the highlight video here

 

Excuse #2!

The second excuse is that a project I’ve been working on for 2+ years went into UAT. I can’t go into too much detail (obviously), but suffice to say that it’s a Claims and Policy management system built upon Dynamics CRM 2011 with MVC4, Web API, Knockout JS, Twitter Bootstrap and other cool things. It even has fuzzy matching built in for fraud detection. I’ve been exceptionally busy with that, as the team had 22 people on it at one point.

So what next?

Exciting times are ahead. I’ll be involved in some Government work using a new tech stack. I’m also switching to using a Mac, which will be pretty cool for me.

TechEd Madrid – Day 3

Security
I’m going off the standard track for this post, as the whole session-update thing at TechEd is a bit monotonous and besides, you can view the cool ones yourselves on Channel 9. What I want to do instead is share my thoughts on something that has been floating about in my head from all the sessions I’ve attended, namely ‘Security’.

Non-Functional Requirements
One of my roles as a Solution Architect is to look at the Non-Functional Requirements of a particular system I’m designing. What’s a Non-Functional Requirement (NFR)? Well, it’s the bits of the system that don’t directly do something – those would be Functional Requirements (for example, think of a login page or password reminder page as a piece of functionality that needs to be implemented).

A Non-Functional Requirement complements the Functional Requirements and basically sets boundaries on what the system should and shouldn’t do. NFRs cover areas such as system availability, resiliency, performance, capacity, scalability and accessibility.

I like to ask potential candidates which of these they think is most important to a system they’ve designed, or are going to design, as it helps me see if they have considered how a system will evolve or whether it is robust enough to cope with user demands. Usually (but not always) Non-Functional Requirements are treated pretty much equally. If the system isn’t available, the customer could lose money, and we could be in breach of SLAs. If the system isn’t performing, the customer could lose money, the users would get angry and business could be lost if orders aren’t placed in time. If it doesn’t scale… you get the picture.

However, I don’t hold this ‘equality’ view anymore. I believe security is by far the most important Non-Functional Requirement a system can have. Why? Well, if an attacker can compromise your customer’s system, they have the means to disrupt daily work, anger users, delay orders, damage reputation etc., as per any of the other NFRs mentioned above, and this alone is bad enough. What’s worse, though, is that they also have the ability to steal the intellectual property that your customer has spent years and huge sums of money developing. This is much, much worse.

Plans, schematics, bids, quotes, source code, employee records, credit card information, medical information, voting records, tenders, contracts, customer databases, sales contacts, pharmaceutical formulas, criminal convictions – all could be gone in an instant. Once it’s out there, you’ve no chance of getting it back.

A single breach could effectively be a killing blow to the customer and to your reputation as a service provider. The scary thing is that it’s getting easier to do; even scarier is that in some cases it’s state funded. Typically these attacks are focused on successful small-to-medium enterprises, as they haven’t the infrastructure or policies in place to implement proper security procedures and they are taking on a lot of new staff who aren’t yet familiar with how things should be done. Sound familiar?

Hashes

I watched demos today of lateral movement and privilege escalation. What does this mean? Well, essentially it means using a low-level account to move along a network from machine to machine, getting access to each machine with the same account (lateral movement). You can then pull from memory any password hashes that are left by, say, a recently logged-in Domain Admin and, using these, escalate your credentials (privilege escalation).

What’s a password hash? Well, basically it’s a one-way transformation of your password into a token or series of characters. If you use the same password each time, you end up with exactly the same hash. It’s one-way in that you can’t reverse the hash to get the password back. If you change the password, even by one letter, you get a totally different hash.
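You can see this for yourself. Here’s a quick sketch using Python’s hashlib – SHA-256 is just an example algorithm for illustration, not what Windows itself uses for account hashes:

```python
import hashlib

# The same password always produces exactly the same hash...
print(hashlib.sha256(b"correct horse battery staple").hexdigest())
print(hashlib.sha256(b"correct horse battery staple").hexdigest())  # identical

# ...but changing even one character gives a totally different hash,
# and there's no way to run the transformation backwards.
print(hashlib.sha256(b"correct horse battery stapleR").hexdigest())
```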

Why use them? Well, a lot of operating systems use this approach for authentication rather than sending passwords around the network, as it’s very quick to hash the password and send that – which is sensible. However, it’s totally open to attack using tools such as Windows Credential Editor (WCE).

I watched today as the presenters demonstrated using WCE and USB infiltration devices like the Rubber Ducky to compromise a system, simply by inserting it into a USB port. Within 3-7 seconds (slowed down for the demo), it had installed a number of back-door passwords, executed a dump from memory (using WCE) of the hashed credentials on the machine and uploaded them to a website. Three to seven seconds.

The tools are becoming more advanced, and as systems become more complicated we have more surface area to attack. The weakest of all these attack vectors, however, is us. You and me.

You see, we are after all only human. So many sites, so many passwords. It’s easy to use the same one for all of them, right? What about us as developers? You run Visual Studio as Administrator on your dev machine, yeah? Turn off User Account Control? What about contractors? How much access do they have to your system? Can users install their own software? Bring in their own USB devices?

What can be done?

It’s an absolutely massive challenge that covers so many areas.

This is a really good article on the anatomy of an attack. Definitely worth a read.

TechEd Madrid – Day 2

Definitely feeling better today. Maybe the 10 hours of sleep did the trick. Anyway, after a fun time this morning getting a taxi to the event (shared one with other TechEdians), I kicked off the day with the following session:

Deep Dive into the Team Foundation Server Agile Planning Tools

Yes, I know we all ‘hate’ MS and new open source technology is just so ‘cool’ (/sarc), but I figured it’s always good to see what’s being done. Let’s not forget that MS keeps a very close eye on technology out there and adapts, buys, copies, acquires, innovates or invents as it sees fit.

The main points are that it has a Kanban board built in, along with team communication. You have support for 5 levels – tasks, stories, features, initiatives and goals – with each level taking a higher-level view of the ones below. These groupings can be named however you like.

Some of the things I liked about the product were:

  • You can set a work in progress limit for a particular list, something that I miss from Trello.
  • Each team can operate their own board with their own list names. MS has been very clever here: they have applied a state engine to the cards on the board, so even though each team has its own list setup, the cards move through an underlying state that’s common to the project. This gives you the ability to track work progress across all teams using the common ‘state’ language (see the sketch after this list).
  • It definitely needs work, but the team know this. For example, you cannot create a card on the board directly or re-order cards, but this will come soon.
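To make that state-engine idea concrete, here’s a rough conceptual model of it (plain Python, not TFS’s actual API – the team and column names are invented):

```python
# Each team names its own Kanban columns, but every column maps onto a
# shared, project-wide state - so progress can be compared across teams.
COMMON_STATES = ["New", "Approved", "Committed", "Done"]

TEAM_COLUMN_MAP = {
    "Team A": {"Backlog": "New", "Ready": "Approved", "Building": "Committed", "Shipped": "Done"},
    "Team B": {"Ideas": "New", "Analysis": "Approved", "Dev + Test": "Committed", "Live": "Done"},
}

def common_state(team: str, column: str) -> str:
    """Translate a team's local column name into the project-wide state."""
    return TEAM_COLUMN_MAP[team][column]

print(common_state("Team A", "Building"))    # Committed
print(common_state("Team B", "Dev + Test"))  # Committed
```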

Given that TFS now works with Git and Azure, I think it’s time to do a bit more research internally to determine how it sits with us in Kainos. Anyone want to get involved?

The Inside Man: Surviving the Ultimate Cyber Threat


My next session was by the always entertaining Andy Malone, who’s an MVP for enterprise security. This is the first of 2 sessions on security I attended today. Here’s a link to the documents Andy shared – some great stuff here. Some key points:


  • 84% of attacks come from the inside
  • Social engineering techniques are continually evolving
  • China has 330,000 operatives in computer security
  • A video showed a German concrete manufacturer who discovered that a Chinese business ‘partner’ had a concealed camera – 4 years of work could easily have been stolen
  • It’s seldom a sudden impulse to steal. It builds over time
  • Reasons can be: revenge, excitement, temptation (sex, money, etc), coercion, gullibility.
  • People involved aren’t crazy, but are usually anti-social / narcissistic
  • ‘We Work for money – if you want my loyalty, buy a dog’
  • There’s a proven 30-day window after an event when data / systems abuse occurs
  • Mitigate the risk; watch out for disgruntled employees through reviews and feedback. Catch the problem before it develops.

Andy then went on to show scary tools and devices, such as using Google to do a deep search for ‘Membership List.xls’ and getting back names, emails, job titles and phone numbers. This was then followed up by using sites such as Pipl, WayBackMachine, Infomine and Cirt (for all the default passwords!), and data-mining tools such as FOCA and Paterva (very scary tools that can tell you a lot about OSs, printers, users, software etc).

One thing that I think should become the norm for users in general is to purge metadata from documents before uploading them to the net. It makes sense to close down any potential avenue for social engineering attacks.

1984: 21st Century Surveillance vs. the Erosion of Freedom

This was a lunchtime session in the same hall, so I just hung about for it. It’s another session by Andy Malone and focused on recent developments around things like PRISM. Andy, being the MVP for Enterprise Security, has been involved with the military and government (UK and US), so it was definitely interesting to listen to. I think a picture paints a thousand words with this one.


This thing is real.


Andy talked a lot about access. Basically, governments have it, no matter what they say. The infrastructure is there, and the likes of GCHQ have a direct connection to the cables. Data is apparently held for 30 days, but there are plans to up this to 12 months.

With regard to PRISM, it’s essentially a massive data-mining tool created by SiSense, an Israeli company. Here’s what it looks like (nice UI! Bet the NSA doesn’t have the ‘share on FB & Twitter’ links though). The example below is for disaster relief, and it uses a standard ‘click to drill down’ interface. You can just imagine how this would work with keywords, watch words, phone call metadata etc.


One of the other things Andy showed us was details on the TOR network. I’ve heard of TOR but never ventured onto it, and I doubt I will – there are some seriously dodgy things on there, like purchasing guns, selling bank and credit card details, buying drugs and even arranging a hit. The advice given was to stay off it. One thing that was kind of cool, though, was the hidden wiki. If you browse to it online, you just get details on what it is, not what it contains; you have to access it via TOR to see the content. Note that security and police services access these sites and set up honey traps, so be aware.

How Many Coffees can you drink while your PC Starts


So I decided to go with my gut and walk into this session as it was in one of the bigger halls. Really glad I did, as it was basically about how to improve boot speed for Windows 7 & 8, and the presenter, Pieter Wigleven, was very funny. Maybe not something I would do every day, but knowing what to look for will be very useful to me, and no doubt to you too.

Pieter ran through some scenarios whereby customers have machines that take > 1hr to boot. They actually had a rota for someone to come into work earlier and turn on all the machines before everyone else arrived!

First things first, you need the Windows 8 ADK. Once you have that, you just select the Windows Performance Toolkit. This toolkit gives you a great UI to see what’s taking so long during boot. Some key points, which probably apply to systems people more than anyone else:

  • Use Group Policy to enforce settings. Get rid of old cmd scripts / VBS scripts.
  • Review the list of starting apps frequently.
  • Schedule start-up tasks to run when the system is idle, using the Task Scheduler
  • Disk means everything. MS tested 30,000 internal machines, and looking at the cost involved for the wait (something like boot delay × boots per day × working days per year × number of machines × employee rate) justified the business case for every single MS machine to use SSDs. When you think about it like that, it makes total sense (see the worked example below).
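To put rough numbers on that formula – these figures are entirely made up for illustration (one avoidable minute of boot delay, one boot a day, a £30/hour employee rate):

```python
# Hypothetical inputs, purely to illustrate the cost-of-waiting formula
boot_delay_hours  = 60 / 3600   # one minute of avoidable boot delay
boots_per_day     = 1
working_days      = 230         # working days per year
machines          = 10_000
employee_rate_gbp = 30          # cost per employee-hour, in GBP

annual_cost = (boot_delay_hours * boots_per_day * working_days
               * machines * employee_rate_gbp)
print(f"£{annual_cost:,.0f} per year")  # £1,150,000 per year
```

A single wasted minute across a big estate quickly pays for a lot of SSDs.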

Developing Core Business Applications with Domain Driven Design


By this stage in the day (plus the remnants of food poisoning) I was flagging. I stayed to the end, but found this talk quite dry. The main points coming out of it are as follows:

  • DDD is now viewed as trendy, though it’s been about for 10+ years
  • Complexity can be either accidental or essential. Essential complexity is determined by the domain; accidental complexity is determined by the use of frameworks and tools.
  • Productivity drops as complexity rises.


  • Size of code drives cost, bugs and delays. What typically happens with delays? Add more people. More people = more code. See the point above!
  • DDD is especially for complex software. Focus on the domain. Always.
  • 4 parts that feed into the model
    • UML as a sketch, typically on a whiteboard
    • Ubiquitous language – use common terms between domain and technical teams
    • Code
    • UI
  • Use tools like SpecFlow to ease into Behaviour-Driven Development (see the sketch after this list)
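SpecFlow is the .NET flavour of this. As a rough illustration of the idea, here’s what the same Given/When/Then binding looks like in Python’s behave – the domain language is invented, and the point is simply that the steps read in the ubiquitous language shared with the business:

```python
# features/steps/claims.py - step bindings for a hypothetical claims domain.
# A matching .feature file would use exactly these phrases.
from behave import given, when, then

@given("a policy holder with an active policy")
def given_active_policy(context):
    context.policy = {"status": "active", "claims": []}

@when("they submit a claim")
def when_claim_submitted(context):
    context.policy["claims"].append({"state": "submitted"})

@then("the claim is recorded against their policy")
def then_claim_recorded(context):
    assert context.policy["claims"][0]["state"] == "submitted"
```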

For more info on the subject, have a look at the actual recording of this session – MS were quick. I assume they have the rest of them up as well, so be sure to check out Channel 9.

TechEd Madrid – Day 1

A year has passed since I attended TechEd Amsterdam and it really doesn’t seem like it. This year it’s in Madrid and it’s being hosted at the IFEMA centre. I was fortunate enough to get over a couple of days beforehand with my wife to take in the sights and sounds.

I never realised what a commercial place Madrid is. If you’re into your shopping, this is the place to come. It is very hot though (~34°C, which doesn’t suit someone like me – hardly any hair, with definite elements of ginger in the hair that does exist). We did manage to take a day trip out to Toledo, which is a must if you get the chance. Amazing place. Also, I learned never to make eye contact in Madrid train station toilets, as I was subsequently ‘felt up’ whilst drying my hands. Be warned.

On the down side, however, I’ve picked up a bout of food poisoning and unfortunately spent last night retching. Anyway, I made it in today (cold shakes and all) and caught most of the Keynote (available here). Apologies for these initial blog entries – I am feeling very dodgy today!

Keynote


  • Big news is that you no longer get charged for a powered-down VM on Azure, and it’s switched to a bill-by-the-minute model. This is great, as I was using EC2 and had a lot of phantom charges added to my account. Time to give Azure another go!
  • If you’re an MSDN subscriber, you get a lot of free VM hours on Azure. I’ll need to check this out to see if Kainos qualifies.
  • More big news: SQL Server 2014 is available for preview download. Built with a focus on the cloud, this technology really improves on performance. You can specify ‘hot’ tables which run in memory, the performance impact of DB latching is removed as the CPU doesn’t have to wait, and stored procs get compiled to a native-code DLL. All this can be done without changing your app. MS are really investing in the in-memory model, which is great, as SSAS is old technology and new contenders like QlikView put it to shame.
  • VS2013 will be released later this year and (yes, I’m saying this) Team Foundation Agile process management looks great. It has a virtual Kanban board with post-its and persistent chat channels – things we typically use other tools such as Trello and Skype for. It also has support for epics across teams.
  • Data is a big theme of the event. Some figures bandied about suggest we’re looking at a 60% increase in data year-on-year. Crazy.
  • In one segment, the talk was on SQL Server 2014 (more on this later), but what caught my eye was Geoflow. Geoflow is an add-in for Excel 2013 which looks fantastic. It’s basically a Google Earth-style globe that you can plot on using data in your Excel sheets. Datasets can be overlaid and drilled down into, and you can have time-bound heat maps that show you how events progress.


Big Data, Small Data, All Data

This session was obviously all about data, and it focused on 3 areas, namely:

  1. Collate
  2. Analyse
  3. Act

The three areas above were played out using a demo for BlueYonder Airlines, a fictional airline built for MS demos. The demo walked us through a scenario whereby we wanted to measure customer satisfaction over a period of time. The presenter loaded data into Hadoop (Azure HDInsight) from 4 sources, namely an Excel sheet with airport data, tweets from Twitter, weather data from an open source weather site and mobile data captured from the logs of BlueYonder mobile apps.

The above formed the collation part of the demo, and the cool thing I found about it is the use of PolyBase technology, which essentially underpins the loading of data from various disparate sources and connects SQL Server to Hadoop. You essentially leave your data where it is and use the EXTERNAL and LOCATION keywords to get SQL Server to point at your Hadoop cluster in Azure.

The analysis scenario used Excel, which was to be expected, and indeed used the Geoflow tool that I mentioned earlier. Again, heat maps were used to show key events in the timeline and helped to focus on a specific date when a lot of customer dissatisfaction was recorded.


The ‘Act’ scenario was kinda contrived, but served to show that, with the data from above, it was easy to determine that the current environment struggled with the uplift in user access via the mobile apps. Something needed to be done. This was all purely to showcase the new in-memory capabilities of SQL Server 2014. This technology has been in the works for years, but MS wanted to bring it out when it was ready, hence the short gap between SQL 2012 and 2014.

The core things you can now do with SQL 2014 are identify ‘hot’ tables and stored procedures. There’s a report to identify these, and you can then choose to migrate them into memory (for tables) or natively compile the stored procedures. Latching (kinda like page locking) is removed, so there’s no CPU waiting. All this uses existing hardware, and it’s hard not to be impressed by the performance increases (10x–30x). Additionally, SQL 2014 can generate DR and fault tolerance on Azure for you (though you have to ask yourself whether that’s a good thing given the current NSA revelations).
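For flavour, here’s a rough sketch of the kind of DDL involved, driven from Python via pyodbc purely for illustration – the connection string, table and procedure names are all made up, and the database needs a memory-optimised filegroup before any of this will run:

```python
import pyodbc

# Hypothetical connection - adjust driver/server/database for your environment.
conn = pyodbc.connect(
    "DRIVER={SQL Server Native Client 11.0};SERVER=.;DATABASE=Demo;"
    "Trusted_Connection=yes",
    autocommit=True,
)
cur = conn.cursor()

# Declare a 'hot' table as memory-optimised...
cur.execute("""
CREATE TABLE dbo.HotOrders (
    OrderId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Total   MONEY NOT NULL
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)""")

# ...and natively compile a stored procedure that reads from it.
cur.execute("""
CREATE PROCEDURE dbo.GetOrder @OrderId INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT OrderId, Total FROM dbo.HotOrders WHERE OrderId = @OrderId
END""")
```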

Microsoft ASP.NET, Web and Cloud Preview

This one is really just a roadmap for where .NET is going.

Some key points:

  • New membership system. Uses Claims based Auth.
  • Bootstrap JS now key part of templates
  • No more crazy project items. One project. Scaffolding support for all flavours of .NET
  • EF has a built in ‘re-connect’ option for cloud based scenarios. I really like this as it aids resiliency.
  • Codename ‘Artery’ (I think) is one of the coolest things I saw. It’s built on SignalR technology: when you F5 debug from VS2013, it keeps a live link to the browser (or browsers, including non-IE ones) you open. If you update your project HTML and save, every single connected browser is automatically refreshed! Even cooler, however, is the news that in future updates, if you use, for example, Chrome tools to update your HTML, the change can be pushed back from the browser to VS. Very cool.
  • VS2013’s HTML editor will be updated as well. JS will be a first-class citizen and you’ll be able to jump to functions and do the cool editor tricks that the likes of Sublime or Notepad++ give you (like updating specific sections of divs/lists etc without affecting anything else).

Introduction to Azure Active Directory (AAD)

I’ve played about with this in the past and wanted to see if there’d been any developments. It’s being pitched as IDaaS (Identity as a Service). If you have Office 365, you already have AAD. Some of the key things to take away from this session are that MS already have 3 MILLION tenants on AAD and handled 7 BILLION authentications this week alone. Looks like it’s a tried and trusted technology.

What I found interesting is that MS recently acquired PhoneFactor, and this has made its way into Azure Active Directory. No view as yet on whether this is a free service or whether 2FA will have its own cost. I suspect the latter.

Advanced Debugging of ASP.NET Apps using VS2012

I finished off the day with this session on debugging. Interesting enough, but I have to be honest and say I bailed when he started talking about breakpoints. That, and the fact that I hadn’t eaten anything after feeling so grim the night before. On the mend now and gearing up for more TechEd tomorrow! One of the bonuses is that I just bought an MS Surface RT for £69, reduced from £459. Bonus for attendees!