Monday, May 27, 2013

The Engineering Awesomeness of F1

... or what software engineers could learn from F1

See that smoke behind the car? Not good...

For a long time, the main requirement in Formula 1 racing was power. Everything in the car but the driver was expected to last for only one race, if that - very often engines or gearboxes would blow up in the middle of a race. Indeed, within the same Grand Prix (i.e., practice sessions, qualifying and the race itself), different engines could be used.

But, of course, it was not all about engines. Every team updated its cars during the season, changing aerodynamic parts, suspension, brakes, etc. And how did they know whether an update designed for the car was really working? Unlimited testing. Even so, I suppose it was still difficult to measure improvement because of changing conditions (from track grip and temperature to the evolution of the driver himself). Nonetheless, with enough mileage, statistics was their friend.


But then, in the last decade, everything changed. Let's talk about F1 challenges for engineering in 2013.


Only 8 engines can be used by a driver throughout the entire year, which means an average of more than 2 Grands Prix (GP) per engine. Gearboxes must last at least 5 GP. We can say that reliability has become a main concern. Refueling is not allowed during races, so low fuel consumption is a must. Power, of course, is still important. Driveability, which is the engine's response to the driver's input (through the throttle pedal), can be changed from race to race. What about space? The smaller the engine, the better the car can be designed around it. Oh, that reminds me of weight, which has big effects on the car's performance. Now, that's a good-looking list of non-functional requirements, right? All of this in a very competitive environment where thousandths of a second can make a big difference.

I think it's fair to say that F1 is the place where trade-offs between non-functional requirements are the most visible. This constant trade-off between performance, reliability and packaging, in its different forms, is present in the design of every element of the car, including engines, gearboxes, brakes, KERS, and suspension. All of this within strict regulations that define the maximum/minimum size and weight of specific components, while at the same time exploiting gaps in those regulations to create innovative solutions and gain an edge over the competition. After design, there is car set-up, changed at every race, seeking the best compromise between qualifying performance, race pace, tyre consumption, heavy-load pace, light-load pace and pit-stop strategy. Lastly, after the initial design, and in addition to set-up changes, we go back to evolution, in terms of updates to the car.

If in the not-so-old days the teams had unlimited testing, allowing them to evaluate their updates, nowadays testing is very, very limited. It consists pretty much of a few collective pre-season testing sessions, 4 days each, plus 4 days of straight-line tests for each team. I'm sad for the drivers - after all, even in programming, which is far more theoretical than race driving, it is difficult to improve without practicing. However, from the engineering point of view, this is just awesome. What do teams do to compensate for this lack of on-track testing?

  • Computer simulation - Now the teams rely a hell of a lot more on computer models to understand the effect of changes to the car. As aerodynamics is so important for performance, a key factor here is computational fluid dynamics (CFD). But not just that: top teams also develop driving simulators which are probably the most realistic racing video games ever made.
  • Wind tunnels - this is when you build a scale model of your car, take it to a tunnel with a huge fan, and see how the air flows over and around the car. The limitation here is that the scale model can be at most 60% of the real size, while the maximum airflow speed is 180 km/h. Compare that with the last Spanish GP, where the maximum speed during qualifying was 318.5 km/h (see the rough scaling calculation after this list). The more marginal the gains to be obtained, the more important the precision of the models becomes.

    Due to these limitations, the last resort is to evaluate updates during the GP weekend, in the...
  • Practice sessions - In a GP weekend, the qualifying session is preceded by 3 free-practice sessions: usually two 1.5-hour sessions on Friday, plus a 1-hour session on Saturday. That amounts to 4 hours of on-track action, which used to serve to acclimatize drivers to the track and to adjust car set-ups. Now, teams also use it to evaluate new parts for the car, especially during the first session. So, the testing is very time-constrained, on a track shared with all the other cars, tyres degrade really fast, and, you know, changing some parts of a car can be time-consuming as well. Oh, and don't forget that rain can ruin the engineers' attempts to evaluate updates.
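As a rough illustration of why wind-tunnel results need so much correlation work, here is a back-of-the-envelope sketch of my own (the car length and air viscosity are assumed values, not the teams' figures): aerodynamic similarity depends on matching the Reynolds number, and a 60% model limited to 180 km/h only reaches about a third of the Reynolds number of a full-size car at qualifying speed.

```python
# Back-of-the-envelope Reynolds-number comparison between a 60% wind-tunnel
# model at the 180 km/h limit and a full-size car at Spanish GP qualifying
# top speed. Car length (5 m) and air kinematic viscosity are assumptions.

def reynolds(speed_kmh, length_m, kinematic_viscosity=1.5e-5):
    """Re = v * L / nu, with speed converted from km/h to m/s."""
    return (speed_kmh / 3.6) * length_m / kinematic_viscosity

full_car = reynolds(speed_kmh=318.5, length_m=5.0)        # real car on track
tunnel   = reynolds(speed_kmh=180.0, length_m=0.6 * 5.0)  # 60% scale model

print(f"full car   : Re = {full_car:.2e}")
print(f"wind tunnel: Re = {tunnel:.2e}")
print(f"the tunnel reaches about {tunnel / full_car:.0%} of the full-scale Reynolds number")
```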

What looks like green paint is used to check air flow
With so many constraints, it's incredible that they are actually able to improve a lot throughout the year. Nonetheless, some top teams have eventually faced big problems - in 2012, it was Ferrari; now, in 2013, it is McLaren. The culprit is a single one: correlation. This is similar to Brian Smith's discussion about computing on models while acting on the real world, which (as he argues) explains why proving a piece of software correct gives us no guarantee that it will behave correctly once it's out there in the real world. So, designers build models and simulations and conclude that some changes will make the car behave in a specific way, but when it hits the track, they may find out that this is not the case. Then, it doesn't matter what 'virtual' improvements are made, since they will not translate into actual improvements. When this happens, what is really important is to fix the correlation issue, i.e., to ensure that the model being used corresponds to the real world.

Then, there is the last F1 aspect I'd like to discuss (for now), which is all about Big Data. F1 cars feature hundreds of sensors, which send real-time data about the car, from brake temperatures to lateral acceleration and component failures. This is used during a race, for instance, to identify problems with the car (and possibly fix them), to inform the driver where (and how) he can improve, to decide when to pit, and to estimate target times for each lap. The last two also require information about the other cars on track.
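Just to give a flavour of what processing such a stream can look like, here is a minimal sketch of an anomaly check over incoming telemetry samples; the channel names, units and thresholds are invented for illustration and are not taken from any real F1 system.

```python
# Toy telemetry monitor: flag sensor readings that fall outside expected ranges.
# Channel names and limits are invented for illustration only.

LIMITS = {
    "brake_temp_front_left": (100.0, 1000.0),  # degrees Celsius
    "lateral_g": (-5.0, 5.0),                  # lateral acceleration, in g
    "oil_pressure": (1.5, 8.0),                # bar
}

def check_sample(sample):
    """Return alert messages for every channel outside its expected range."""
    alerts = []
    for channel, value in sample.items():
        low, high = LIMITS.get(channel, (float("-inf"), float("inf")))
        if not low <= value <= high:
            alerts.append(f"{channel}={value} outside [{low}, {high}]")
    return alerts

# Two samples standing in for the live data stream coming from the car.
stream = [
    {"brake_temp_front_left": 650.0, "lateral_g": 3.8, "oil_pressure": 4.2},
    {"brake_temp_front_left": 1040.0, "lateral_g": 4.1, "oil_pressure": 1.1},
]

for sample in stream:
    for alert in check_sample(sample):
        print("ALERT:", alert)
```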

Additional sensors behind the left tyre; these are not used in races.

Besides all these challenges, there are the more specific aspects of automotive technologies, such as combustion, energy recovery, materials, and so on. Anyway, summing up just the challenges we discussed, we have:
  • Difficult compromises between non-functional requirements, in a very competitive environment, with strict technical regulations;
  • Lack of on-track testing forcing the teams to rely on models and simulations;
  • Gathering and processing big data in real time.
Pretty awesome, right? I bet we software engineers could learn a thing or two from these guys.


Images copyright: Ferrari with blown engine, from http://www.dailymail.co.uk. Mercedes with sensor, from http://www.crash.net. Red Bull with flow viz, from http://www.auto123.com.

Monday, May 20, 2013

Highlights from ICSE 2013

Last weekend was the beginning of the 35th International Conference on Software Engineering (a.k.a., ICSE'13). Here are my comments on some of the papers that will be presented in the main track:

  • Beyond Boolean Product-Line Model Checking: Dealing with Feature Attributes and Multi-features
  • Analysis of User Comments: An Approach for Software Requirements Evolution
  • What Good Are Strong Specifications?
  • UML in Practice
  • GuideArch: Guiding the Exploration of Architectural Solution Space under Uncertainty

By the way, "highlights" are by no means "best papers", just a selection based on my personal interests. Check out the proceedings to see other good papers within your field.



Beyond Boolean Product-Line Model Checking: Dealing with Feature Attributes and Multi-features
by Maxime Cordy, Pierre-Yves Schobbens, Patrick Heymans and Axel Legay

This paper is about two important concerns for variability models: numerical attributes and multi-features. Usually, features can be seen as boolean: either they're there (in the product) or they're not. Numerical attributes, on the other hand, are... numerical. I mean, it's not just a matter of whether they're there or not, but of what their value is. Examples from the paper are timeout and message size (in a communication protocol). The other thing is multi-features, which are features that may have multiple instances. E.g., think about a smart house, which may have NO video camera, may have ONE camera, or may even have MULTIPLE cameras (each with its own configuration).

What I really liked about this paper is that it describes the ambiguities that arise when defining multi-features with child features, with respect to group cardinalities and feature cardinalities. It then goes on to define precise semantics that prevent this ambiguity. It also tackles the efficiency problem of reasoning about numeric attributes in variability models by mapping them to sets of boolean variables. So they have two versions of their model-checking tool: one using a Satisfiability Modulo Theories (SMT) solver, which handles numeric attributes but inefficiently; and another one which converts them to boolean variables, which is more efficient but also more limited.
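To make the idea of the boolean encoding concrete, here is a tiny sketch of my own (not the paper's actual construction): a bounded integer attribute such as a timeout in 0..15 can be represented by four boolean variables via its binary expansion, so a constraint over the attribute becomes a constraint over those booleans.

```python
# Toy encoding of a bounded numeric feature attribute (timeout in 0..15)
# as boolean variables via binary expansion. My own simplified example,
# not the encoding actually used in the paper.

import itertools

N_BITS = 4  # enough to cover the values 0..15

def to_bits(value, n_bits=N_BITS):
    """Encode an integer attribute value as a tuple of booleans (LSB first)."""
    return tuple(bool((value >> i) & 1) for i in range(n_bits))

def from_bits(bits):
    """Decode the boolean variables back into the attribute value."""
    return sum(int(b) << i for i, b in enumerate(bits))

# A constraint on the attribute, e.g. "timeout >= 5", becomes a constraint
# over the four boolean variables:
satisfying = [bits for bits in itertools.product([False, True], repeat=N_BITS)
              if from_bits(bits) >= 5]

print(f"{len(satisfying)} of {2 ** N_BITS} boolean assignments satisfy timeout >= 5")
print("e.g. timeout=9 is encoded as", to_bits(9))
```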

Another good quality, from the writing point of view, is that instead of jumping straight to the underlying formalism (which, due to space limitations, is always a tempting option) and relying on the reader to decode it, the paper provides clear, precise definitions of the problem and of the solution in plain old English text (POET? Is that a thing?).

This work addresses some of the problems we're facing extending a framework on requirements for adaptive systems; hopefully we can borrow some of these ideas to improve our own work.


Analysis of User Comments: An Approach for Software Requirements Evolution
by Laura V. Galvis Carreño and Kristina Winbladh

We all know the importance of user feedback for improving a product. If, before the Internet was used to distribute software, the problem was "how can we get this feedback?", the problem now is "how can we go through the sheer amount of feedback available and capture what is useful?" The idea in this paper is to automatically extract the relevant information from said feedback, providing a concise report which serves as an input for requirements analysis. This report includes the topics mentioned in the feedback and excerpts related to each topic.
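Just to illustrate what automatic topic extraction from comments can look like, here is a generic sketch using scikit-learn's topic modeling; it is not the authors' actual pipeline, and the comments below are invented.

```python
# Generic topic-extraction sketch over (invented) user comments.
# This is not the pipeline used in the paper, just an illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "The app crashes every time I rotate the screen",
    "Please add a dark theme, the white background hurts at night",
    "It crashes on startup since the last update",
    "Love the new theme options, but the icons look blurry",
]

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(comments)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Report the top words per extracted topic.
words = vectorizer.get_feature_names_out()
for topic_id, weights in enumerate(lda.components_):
    top_words = [words[i] for i in weights.argsort()[::-1][:4]]
    print(f"Topic {topic_id}: {', '.join(top_words)}")
```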

Evaluating this kind of approach is quite tricky. They compared the results of the automatic extraction with those of a manual extraction. The trade-off here is that the manual extraction was done by the same people who developed the approach, so there is some bias, as their heuristics and criteria were to some extent embedded in the automatic extraction. On the other hand, if the manual extraction had been done by others, perhaps it would not have been feasible to compare the two results. That said, I think the authors did a pretty good job with the evaluation.

Speaking of evaluation, it would be good if they made their (complete) data set available, as well as their implementation, to allow replication of the experiment. Also, I wish I had a popular product of my own on which to test this approach ;)


What Good Are Strong Specifications?
by Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei, and Bertrand Meyer

There is some prejudice against formal methods. Been there, done that. By the way, take a look at what Dijkstra had to say about this ("On the Economy of doing Mathematics"). Nevertheless, practitioners have been very keen to jump on the bandwagon of test-driven development and design-by-contract (omg, was this all a worldwide plot to gradually make people unknowingly adopt formal methods? If so, hats off!). Anyway, this paper describes these approaches as providing lightweight specifications, and tries to find out what the comparative benefit of adopting strong (heavyweight?) specifications would be, with pre-conditions, post-conditions and invariants (which, according to my spell-checker, is not a word).
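To make the contrast concrete, here is a small sketch of my own in Python (the paper's examples are in Eiffel, so this is only an analogy): a 'lightweight' contract checks little more than a precondition, while a stronger specification also pins down what the result must be and how the data structure may change.

```python
# Lightweight vs. stronger specification of a stack pop (my own toy analogy).

def pop_lightweight(stack):
    # Lightweight contract: only a precondition.
    assert len(stack) > 0, "pre: stack must not be empty"
    return stack.pop()

def pop_strong(stack):
    # Stronger specification: precondition plus postconditions over the
    # whole abstract state of the stack.
    assert len(stack) > 0, "pre: stack must not be empty"
    old = list(stack)                      # snapshot for the postconditions
    result = stack.pop()
    assert result == old[-1], "post: the result is the old top element"
    assert stack == old[:-1], "post: the remaining elements are untouched"
    return result

s = [1, 2, 3]
print(pop_strong(s), s)  # prints: 3 [1, 2]
```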

I like empirical work that challenges "conventional wisdom", independently of whether the results are positive or negative. The result here is that better specifications lead to more faults being identified. What about effort? "These numbers suggest that the specification overhead of MBC is moderate and abundantly paid off by the advantages in terms of errors found and quality of documentation". Well, to me this conclusion seemed far-fetched. First, the spec/code ratio of the strong specifications is more than double that of the lightweight ones. Second, while they compare this ratio with the ratios of other approaches, it ultimately boils down to how difficult it is and how long it takes to write those lines, which is (understandably) not compared.

So, my provocative question is: for better specifications, do we need better techniques/formalisms/technologies, or do we need better education on using what's already out there? Moreover, the specification checks the program, but what/who checks the specification?



UML in Practice
by Marian Petre

UML is the de facto standard for software modeling. Or is it? If we look at academic papers, it's usual to see statements along the lines of "we adopted (insert UML diagram here) because it is an industrial standard, largely adopted by practitioners..." (the same goes for BPMN, btw). This paper tries to demystify that notion, through interviews with 50 software developers (from 50 different companies). The results are... *drumroll*
  • No UML (35/50)
  • Retrofit (1/50)
  • Selective (11/50)
  • Automated code generation (3/50)
  • Wholehearted (none)
Within 'selective use', the most adopted diagrams are class diagrams (7), sequence diagrams (6) and activity diagrams (also 6). While these results are interesting in themselves, I wish the author had also explicitly identified whether the participants use some other kind of modeling. After all, it may be the case that UML is not widely adopted in general, but is widely adopted within the set of people who use some kind of modeling.

Anyway, besides an overview of UML criticisms in the literature, the paper provides more details and discusses the answers from the interviews; it's a must-read for anyone designing modeling languages. A key point is complexity. Sorry, but I have to shout it out loud now: guys, let's keep things simple! Let's create our new approaches/models/frameworks/tools, but then let's consciously work toward simplifying them; we don't want to increase complexity, we want to tame it.

Back to the article, some broad categories of criticism described in the paper are: lack of context, the overhead of understanding the notation, and issues of synchronization/consistency (meaning co-evolution of models). I'd like to finish this informal review with a remark from one of the participants of the study, who said his company dropped UML for an "in-house formalism that addresses the 'whole' system, i.e. software, hardware and how they fit together, along with context, requirements (functional and nonfunctional), decisions, risks, etc." - I wish that company would share their solution with us! =)


GuideArch: Guiding the Exploration of Architectural Solution Space under Uncertainty
by Naeem Esfahani, Sam Malek and Kaveh Razavi

To make architectural decisions, we need information that we can only measure precisely later on - e.g., response time, battery consumption. However, the later we change a decision, the costlier it is. This conflict is the subject of this paper. It provides a framework for reasoning with imprecise information, which can be made more precise later in the project. This imprecise information is the impact of each alternative on quality properties, expressed either as a numerical range or as an enumeration (e.g., low, medium, high). It then uses fuzzy logic to reason over these ranges.
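Very roughly, the flavour of this kind of reasoning can be sketched with plain interval arithmetic (a simplification of the fuzzy logic GuideArch actually uses; the decisions, properties and ranges below are made up): each alternative carries a range per property, combining them still yields a range, and alternatives can be compared before the numbers are known precisely.

```python
# Simplified interval-based comparison of architectural alternatives.
# GuideArch uses fuzzy logic; this plain-interval version with made-up
# alternatives and ranges is only meant to convey the idea.

alternatives = {
    # alternative: (response_time_ms_range, battery_drain_mA_range)
    "local_db":  ((20, 60),  (120, 180)),
    "cloud_api": ((80, 300), (60, 140)),
}

def total_cost(ranges, weights=(1.0, 1.0)):
    """Combine per-property ranges into a single [best, worst] cost interval."""
    best = sum(w * low for w, (low, _) in zip(weights, ranges))
    worst = sum(w * high for w, (_, high) in zip(weights, ranges))
    return best, worst

for name, ranges in alternatives.items():
    best, worst = total_cost(ranges)
    print(f"{name}: combined cost in [{best}, {worst}]")

# With equal weights the best cases tie (140 vs 140), but 'local_db' has a
# much better worst case (240 vs 440); changing the weights as the project's
# priorities become clearer can shift the comparison.
```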

This seems similar to the notions of softgoals and quality constraints in requirements modeling. Softgoals are goals for which there is no clear-cut achievement criterion. To make decisions, we can analyze the subjective impact (contribution) of different alternatives on the softgoals of interest. On the other hand, at some point we may define/assume a refinement of a softgoal (usually a quality constraint), becoming able to make objective decisions.

I particularly like the case study they give with 10 architectural decisions and their alternatives - more details here. They also provide a web-based tool, but I wonder if they used their approach when architecting the tool - after all, which quality property could lead them to use Silverlight?



I'd like to hear your views on these papers - consider yourself invited to continue the discussion in the comments below!

Wednesday, May 15, 2013

Create the thing you want to exist in the world

New post-it for my office =)



Send me your own drawing through the comments and I'll be happy to update the post with your pictures!