Monday, May 20, 2013

Highlights from ICSE 2013

Last weekend was the beginning of the 35th International Conference on Software Engineering (a.k.a. ICSE'13). Here are my comments on some of the papers that will be presented in the main track:

  • Beyond Boolean Product-Line Model Checking: Dealing with Feature Attributes and Multi-features
  • Analysis of User Comments: An Approach for Software Requirements Evolution
  • What Good Are Strong Specifications?
  • UML in Practice
  • GuideArch: Guiding the Exploration of Architectural Solution Space under Uncertainty

By the way, "highlights" are by no means "best papers", just a selection based on my personal interests. Check out the proceedings to see other good papers within your field.



Beyond Boolean Product-Line Model Checking: Dealing with Feature Attributes and Multi-features
by Maxime Cordy, Pierre-Yves Schobbens, Patrick Heymans and Axel Legay

This paper is about two important concerns for variability models: numerical attributes and multi-features. Usually features can be seen as boolean: either they're there (in the product), or they're not. Numerical attributes, on the other hand, are... numerical. I mean, it's not just a matter of whether they're there or not, but of what their value is. Examples from the paper are timeout and message size (in a communication protocol). The other thing is multi-features, which are features that may have multiple instances. E.g., think about a smart house, which may have NO video camera, may have ONE camera, or may have MULTIPLE cameras (each with its own configuration).

What I really liked about this paper is that it describes the ambiguities that arise when defining multi-features with child features, with respect to group cardinalities and feature cardinalities. It then goes on to define precise semantics that prevent this ambiguity. It also tackles the efficiency problem of reasoning with numeric attributes in variability models by mapping them to sets of boolean variables. So they have two versions of their model-checking tool: one using a Satisfiability Modulo Theories (SMT) solver, which handles numeric attributes but inefficiently; and another one which converts them to boolean variables, which is more efficient but also more limited.
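To make that boolean mapping a bit more concrete, here is a tiny Python sketch of the general idea - my own illustration, not the authors' actual encoding - showing how a bounded numeric attribute such as a timeout can be bit-blasted into a handful of boolean variables, so that a purely boolean reasoner can handle it:

# Toy illustration only: encode a bounded integer attribute (e.g., a timeout
# in the range 0..15) as a list of boolean variables, and decode it back.
# The paper's actual encoding for model checking is more involved.

def encode_attribute(value, bits=4):
    """Encode a bounded integer attribute as booleans (least significant bit first)."""
    assert 0 <= value < 2 ** bits, "value outside the declared range"
    return [(value >> i) & 1 == 1 for i in range(bits)]

def decode_attribute(booleans):
    """Recover the integer value from its boolean encoding."""
    return sum(1 << i for i, b in enumerate(booleans) if b)

if __name__ == "__main__":
    timeout = 9                          # hypothetical timeout value, in seconds
    bits = encode_attribute(timeout)     # [True, False, False, True]
    print(bits, decode_attribute(bits) == timeout)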

Another good quality, from the writing point of view, is that instead of jumping straight to the underlying formalism (which, due to space limitations, is always a tempting option) and relying on the reader to decode it, the paper provides clear, precise definitions of the problem and of the solution in plain old English text (POET? Is that a thing?)

This work addresses some of the problems we're facing while extending a framework on requirements for adaptive systems; hopefully we can borrow some of these ideas to improve our own work.


Analysis of User Comments: An Approach for Software Requirements Evolution
by Laura V. Galvis Carreño and Kristina Winbladh

We all know the importance of user feedback for improving a product. If, before the Internet became the main channel for distributing software, the problem was "how can we get this feedback?", the problem now is "how can we go through the sheer amount of feedback available and capture what is useful?" The idea in this paper is to automatically extract the relevant information from that feedback, providing a concise report which will be an input for requirements analysis. This report includes the topics mentioned in the feedback and excerpts related to each topic.
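Just to give a flavor of what this kind of topic extraction can look like (this is my own toy sketch using scikit-learn's LDA, not the authors' pipeline - the library, parameters and sample comments are all my assumptions), something like the following groups comments into topics and lists the most characteristic terms of each:

# Toy sketch, not the paper's approach: cluster user comments into topics and
# print the top terms of each topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "the app crashes when I rotate the screen",
    "crashes on startup after the last update",
    "please add an option to sync with the calendar",
    "would love calendar sync and reminders",
]

vectorizer = CountVectorizer(stop_words="english")
term_matrix = vectorizer.fit_transform(comments)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(term_matrix)

terms = vectorizer.get_feature_names_out()
for topic_id, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[-3:][::-1]]
    print(f"topic {topic_id}: {', '.join(top_terms)}")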

The evaluation of this kind of approach is quite tricky. They compared results from the automatic extraction with results from manual extraction. The trade-off here is that the manual extraction was done by the same people who developed the approach, so there is some bias, as their heuristics and criteria were to some extent embedded in the automatic extraction. On the other hand, had the manual extraction been done by others, it might not have been feasible to compare the two results. That said, I think the authors did a pretty good job with the evaluation.

Speaking of evaluation, it would be good if they made their (complete) data set available, as well as their implementation, to allow replication of the experiment. Also, I wish I had a popular product of my own on which to test this approach ;)


What Good Are Strong Specifications?
by Nadia Polikarpova, Carlo A. Furia, Yu Pei, Yi Wei, and Bertrand Meyer

There is some prejudice against formal methods. Been there, done that. By the way, take a look at what Dijkstra had to say about this ("On the Economy of doing Mathematics"). Nevertheless, practitioners have been very keen to jump on the bandwagon of test-driven development and design-by-contract (omg, was this all a worldwide plot to gradually make people unknowingly adopt formal methods? If so, hats off!). Anyway, this paper describes these approaches as providing lightweight specifications, and tries to find out what the comparative benefit of adopting strong (heavyweight?) specifications would be, with pre-conditions, post-conditions and invariants (which, according to my spell-checker, is not a word).
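To illustrate the distinction (in Python rather than the Eiffel setting of the paper, and purely as my own sketch), here is the same routine with a lightweight contract and with a strong one; the strong version pins down the result completely, which is exactly what gives an automated checker more faults to catch:

# Toy sketch: lightweight vs. strong specification for the same routine.
def dedup_lightweight(xs):
    """Remove duplicates, keeping first occurrences; weak postcondition only."""
    result = list(dict.fromkeys(xs))
    assert len(result) <= len(xs)                     # result is not longer than input
    return result

def dedup_strong(xs):
    """Same routine, but the contract fully specifies the expected result."""
    result = list(dict.fromkeys(xs))
    assert len(set(result)) == len(result)            # no duplicates remain
    assert set(result) == set(xs)                     # nothing lost, nothing invented
    assert all(xs.index(a) < xs.index(b)              # first-occurrence order preserved
               for a, b in zip(result, result[1:]))
    return result

print(dedup_strong([3, 1, 3, 2, 1]))  # [3, 1, 2]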

I like empirical work that challenges "conventional wisdom", independently of whether the results are positive or negative. The result here is that stronger specifications lead to more faults being identified. What about effort? "These numbers suggest that the specification overhead of MBC is moderate and abundantly paid off by the advantages in terms of errors found and quality of documentation". Well, to me this conclusion seemed far-fetched. First, the spec/code ratio of the strong specifications is more than double that of the lightweight ones. Second, while they compare this ratio with the ratio of other approaches, it boils down to how difficult it is and how long it takes to write those lines, which is (understandably) not compared.

So, my provocative question is: for better specifications, do we need better techniques/formalisms/technologies, or do we need better education on using what's already out there? Moreover, the specification checks the program, but what/who checks the specification?



UML in Practice
by Marian Petre

UML is the de facto standard for software modeling. Or is it? If we look at academic papers, it's common to see statements along the lines of "we adopted (insert UML diagram here) because it is an industrial standard, largely adopted by practitioners..." (the same goes for BPMN, btw). This paper tries to demystify that notion, through interviews with 50 software developers (from 50 different companies). The results are...  *drumroll*
  • No UML (35/50)
  • Retrofit (1/50)
  • Selective (11/50)
  • Automated code generation (3/50)
  • Wholehearted (none)
Within 'selective use', the most adopted diagrams are class diagrams (7), sequence diagrams (6) and activity diagrams (also 6). While these results are interesting by themselves, I wish the author had also explicitly identified whether the interviewees use some other kind of modeling. After all, it may be the case that UML is not largely adopted in general, but is largely adopted within the set of people that use some kind of modeling.

Anyway, besides an overview of UML criticisms in the literature, the paper provides more details and discusses the answers from the interviews; it's a must-read for anyone designing modeling languages. A key point is complexity. Sorry, but I have to shout it out loud now: guys, let's keep things simple! Let's create our new approaches/ models/ frameworks/ tools, but then let's consciously work toward simplifying them; we don't want to increase complexity, we want to tame it.

Back to the article, some broad categories of criticism described in the paper are: lack of context, overheads of understanding the notation, and issues of synchronization/consistency (meaning co-evolution of models). I'd like to finish this informal review with a remark from one of the participants of the study, who said his company dropped UML for an "in-house formalism that addresses the 'whole' system, i.e. software, hardware and how they fit together, along with context, requirements (functional and nonfunctional), decisions, risks, etc." - I wish that company would share their solution with us! =)


GuideArch: Guiding the Exploration of Architectural Solution Space under Uncertainty
by Naeem Esfahani, Sam Malek and Kaveh Razavi

To make architectural decisions, we need information that we can only measure precisely later on - e.g., response time, battery consumption. However, the later we change a decision, the costlier it is. This conflict is the subject of this paper. It provides a framework for reasoning with imprecise information, which can be made more precise later on in the project. This imprecise information is the impact of an alternative on quality properties, expressed either as numerical ranges or as enumerations (e.g., low, medium, high). It then uses fuzzy logic to reason over these ranges.
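Just to make the range-based idea concrete (this is my own rough sketch, not GuideArch's actual fuzzy-logic machinery; the alternatives, properties and weights below are made up), even plain interval arithmetic already gives best-case/worst-case rankings:

# Rough sketch only: rank architectural alternatives whose quality properties
# are known only as (low, high) ranges. Lower scores are better here.
alternatives = {
    "local_processing": {"response_time_ms": (20, 60), "battery_mwh": (80, 120)},
    "cloud_offloading": {"response_time_ms": (50, 200), "battery_mwh": (30, 70)},
}

weights = {"response_time_ms": 0.6, "battery_mwh": 0.4}  # relative importance

def score(ranges):
    """Optimistic and pessimistic weighted scores for one alternative."""
    best = sum(weights[prop] * low for prop, (low, high) in ranges.items())
    worst = sum(weights[prop] * high for prop, (low, high) in ranges.items())
    return best, worst

for name, ranges in alternatives.items():
    best, worst = score(ranges)
    print(f"{name}: best case {best:.1f}, worst case {worst:.1f}")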

This seems similar to the notions of softgoals and quality constraints in requirements modeling. Softgoals are goals for which there are no clear-cut achievement criteria. To make decisions, we can analyze the subjective impact (contribution) of different alternatives on the softgoals of interest. On the other hand, at some point we may define/assume a refinement of that softgoal (usually a quality constraint), which then allows us to make objective decisions.

I particularly liked the case study they present, with 10 architectural decisions and their alternatives - more details here. They also provide a web-based tool, but I wonder if they used their own approach when architecting the tool - after all, which quality property could lead them to use Silverlight?



I'd like to hear your views on these papers; consider yourself invited to continue the discussion in the comments below!
