Dagstuhl Seminar 12071
Software Clone Management Towards Industrial Application
(Feb 12 – Feb 17, 2012)
Organizers
- Ira D. Baxter (Semantic Designs - Austin, US)
- Michael Conradt (Google - München, DE)
- James R. Cordy (Queen's University - Kingston, CA)
- Stan Jarzabek (National University of Singapore, SG)
- Rainer Koschke (Universität Bremen, DE)
Contact
- Susanne Bach-Bernhard (for administrative matters)
Software clones are identical or similar pieces of code or design. They often result from copying and pasting as an act of ad-hoc reuse by programmers. Software clone research is highly relevant to software engineering research and practice today. Several studies have shown that there is a high degree of redundancy in software, in both industrial and open-source systems. This redundancy carries the risk of update anomalies and increased maintenance effort.
Many techniques exist that try to detect clones. Some of them are already available in open-source tools (e.g., PMD) as well as commercial tools (e.g., CloneDR). There are also lines of research in clone detection that evaluate these approaches, reason about ways to remove clones, assess the effect of clones on maintainability, track their evolution, and investigate root causes of clones. Today, research in software clones is an established field with more than 100 publications in various conferences and journals.
The purpose of this seminar was to solidify and give shape to this research area and community. Unlike previous similar events, this Dagstuhl seminar put particular emphasis on the industrial application of software clone management methods and tools. It aimed to gather concrete usage scenarios of clone management in industry, which will help to identify new industrially relevant aspects and shape future research. Research in software clones is very close to industrial application, so, among other things, we focused on issues of industrial adoption of our methods and tools.
To achieve our goals, we invited many participants from industry and reached about 30% industrial participation. Talks were given mostly by industrial participants, who shared their experiences with us and posed their problem statements. Academic participants could give a talk only if it had a clear focus on industrial experiences, needs, problems, and applications of software clone management and related research fields. The focus, however, was on interaction in the form of plenary discussions and smaller working groups. The topics for the working groups were gathered by clustering issues the participants wanted to discuss at the seminar. The seminar wiki was used intensively to record the results of the working groups. This agile format was very much appreciated by the participants.
The remainder of the report relies on the following current categorization of clones:
- Type-1 clone: Identical fragments only.
- Type-2 clone: Lexically identical fragments except for variations in identifiers, literals, types, whitespace, layout, and comments.
- Type-3 clone: Gapped clones, that is, clones where statements have been added, removed, or modified.
- Type-4 clone: Semantic clones, that is, clones with similar semantics but different implementations in code.
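The categorization above can be illustrated with a minimal sketch that assigns a pair of Python fragments to Type-1, Type-2, or Type-3. It is not the method of any tool mentioned in this report. The function names, the use of Python's standard tokenize and difflib modules, and the 0.7 similarity threshold for gapped clones are all assumptions made for illustration. Type-4 (semantic) clones cannot be recognized by lexical comparison and are deliberately left out.

```python
import difflib
import io
import keyword
import tokenize


def normalize(code):
    """Turn a fragment into a token list that ignores Type-2 variations.

    Identifiers become "ID", literals become "LIT", comments and blank
    lines are dropped, and indentation width is ignored.
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER):
            continue  # comments and layout carry no lexical content
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        elif tok.type in (tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT):
            out.append(tokenize.tok_name[tok.type])  # keep structure, not width
        else:
            out.append(tok.string)  # keywords and operators stay as written
    return out


def classify(a, b, gap_threshold=0.7):
    """Return "Type-1", "Type-2", "Type-3", or None for two fragments.

    gap_threshold is an assumed cut-off, not a value from the report.
    """
    if a == b:
        return "Type-1"  # identical text
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return "Type-2"  # identical after normalization
    ratio = difflib.SequenceMatcher(None, na, nb, autojunk=False).ratio()
    if ratio >= gap_threshold:
        return "Type-3"  # similar, with added, removed, or modified tokens
    return None  # not a lexical clone; Type-4 is out of scope here


# Hypothetical fragments used only to exercise the sketch.
original = "total = 0\nfor x in items:\n    total += x\n"
renamed = "sum_ = 0\nfor y in values:  # accumulate\n  sum_ += y\n"
gapped = "total = 0\nfor x in items:\n    print(x)\n    total += x\n"
unrelated = "import os\nprint(os.getcwd())\n"

for other in (original, renamed, gapped, unrelated):
    print(classify(original, other))
```

Because Type-3 is decided by a similarity threshold, the boundary between a gapped clone and unrelated code depends on the chosen cut-off; a different threshold would classify borderline pairs differently.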
Participants
- Hamid Abdul Basit (LUMS - Lahore, PK)
- Ira D. Baxter (Semantic Designs - Austin, US)
- Saman Bazrafshan (Universität Bremen, DE)
- Michel Chilowicz (University Paris-Est - Marne-la-Vallée, FR)
- Michael Conradt (Google - München, DE)
- James R. Cordy (Queen's University - Kingston, CA)
- Yingnong Dang (Microsoft Research - Beijing, CN)
- Serge Demeyer (University of Antwerp, BE)
- Stephan Diehl (Universität Trier, DE)
- Daniel M. German (University of Victoria, CA)
- Nils Göde (CQSE GmbH - Garching, DE)
- Michael W. Godfrey (University of Waterloo, CA)
- Jan Harder (Universität Bremen, DE)
- Armijn Hemel (GPL Violations Project, NL)
- Elmar Jürgens (CQSE GmbH - Garching, DE)
- Cory J. Kapser (Calgary, Alberta, CA)
- Jindae Kim (HKUST - Kowloon, HK)
- Rainer Koschke (Universität Bremen, DE)
- Jens Krinke (University College London, GB)
- Thierry Lavoie (Polytechnique Montreal, CA)
- Angela Lozano (University of Louvain, BE)
- Douglas Martin (Queen's University - Kingston, CA)
- Ravindra Naik (Tata Consultancy Services - Pune, IN)
- Jochen Quante (Robert Bosch GmbH - Stuttgart, DE)
- Martin Robillard (McGill University - Montreal, CA)
- Sandro Schulze (Universität Magdeburg, DE)
- Niko Schwarz (Universität Bern, CH)
- Werner Teppe (Amadeus Germany GmbH, DE)
- Rebecca Tiarks (Universität Bremen, DE)
- Gunther Vogel (Robert Bosch GmbH - Stuttgart, DE)
- Andrew Walenstein (University of Louisiana at Lafayette, US)
- Minhaz Zibran (University of Saskatchewan - Saskatoon, CA)
Classification
- Software engineering
- Distribution, Maintenance, and Enhancement
- Reusable Software
- Hardware/Software Protection
Keywords
- Software clones
- code redundancy
- clone detection
- redundancy removal
- software refactoring
- software reengineering
- plagiarism detection
- copyright infringement
- source differencing