An Empirical Study on the Use and Misuse of Java 8 Streams,
10 pages (to appear).
by Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh and Baishakhi Ray.
[FASE 2020]
Poster: Which Similarity Metric to Use for Software Documents? A Study on Information Retrieval-Based Software Engineering Tasks, Md Masudur Rahman, Saikat Chakraborty, Baishakhi Ray. 2 pages. ICSE’18-Poster.
Poster: Searching for High-performing Software Configurations with Metaheuristic Algorithms, Chong Tang, Kevin Sullivan, Baishakhi Ray. 2 pages. ICSE’18-Poster.
Poster: A Recommender System for Developer Onboarding, Chao Liu, Dan Yang, Xiaohong Zhang, Haibo Hu, Jed Barson, Baishakhi Ray. 2 pages. ICSE’18-Poster.
On the “Naturalness” of Buggy Code,
12 pages, acceptance rate: 19%
by Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, Premkumar Devanbu.
[ICSE 2016]
@inproceedings{ray2016naturalness,
title={On the ``Naturalness'' of Buggy Code},
author={Ray, Baishakhi and Hellendoorn, Vincent and Godhane, Saheel and Tu, Zhaopeng and Bacchelli, Alberto and Devanbu, Premkumar},
series = {ICSE '16},
year={2016},
organization={ACM}
}
Real software, the kind working programmers produce by the kLOC to solve
real-world problems, tends to be “natural”, like speech or natural language; it
tends to be highly repetitive and predictable. Researchers have captured this
naturalness of software through statistical models and used them to good effect
in suggestion engines, porting tools, coding standards checkers, and idiom
miners. This suggests that code that appears improbable, or surprising, to a
good statistical language model is “unnatural” in some sense, and thus possibly
suspicious. In this paper, we investigate this hypothesis. We consider a large
corpus of bug fix commits (ca. 7,139), from 10 different Java projects, and
focus on its language statistics, evaluating the naturalness of buggy code and
the corresponding fixes. We find that code with bugs tends to be more entropic
(i.e. unnatural), becoming less so as bugs are fixed. Ordering files for
inspection by their average entropy yields cost-effectiveness scores comparable
to popular defect prediction methods. At a finer granularity, focusing on
highly entropic lines is similar in cost-effectiveness to some well-known
static bug finders (PMD, FindBugs) and ordering warnings from these bug finders
using an entropy measure improves the cost-effectiveness of inspecting code
implicated in warnings. This suggests that entropy may be a valid, simple way
to complement the effectiveness of PMD or FindBugs, and that search-based
bug-fixing methods may benefit from using entropy both for fault-localization
and searching for fixes.
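The ranking idea above can be sketched with a toy language model. The paper uses a cache-augmented n-gram model over lexed source tokens; the bigram model, add-one smoothing, and hand-made token lists below are simplifying assumptions for illustration only:

```python
import math
from collections import defaultdict

def train_bigram(corpus):
    """Count unigrams and bigrams over a corpus of tokenized lines."""
    unigrams, bigrams = defaultdict(int), defaultdict(int)
    for tokens in corpus:
        padded = ["<s>"] + tokens
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    return unigrams, bigrams

def line_entropy(tokens, unigrams, bigrams, vocab_size):
    """Average negative log2-probability per token (add-one smoothing).
    A high value means the line is 'surprising' to the model."""
    padded = ["<s>"] + tokens
    bits = sum(-math.log2((bigrams[(p, c)] + 1) / (unigrams[p] + vocab_size))
               for p, c in zip(padded, padded[1:]))
    return bits / len(tokens)
```

Sorting lines by `line_entropy` in descending order then yields an inspection order, analogous to how the paper orders files and static-analysis warnings.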
2015
Assert Use in GitHub Projects,
11 pages, acceptance rate: 18.5%
by Casey Casalnuovo, Prem Devanbu, Abilio Oliveira, Vladimir Filkov, Baishakhi Ray.
[ICSE 2015]
@inproceedings{casalnuovo2015assert,
title={Assert Use in GitHub Projects},
author={Casalnuovo, Casey and Devanbu, Premkumar and Oliveira, Abilio and Filkov, Vladimir and Ray, Baishakhi},
series = {ICSE '15},
year={2015},
organization={ACM}
}
Assertions in a program are believed to help with automated verification, code
understandability, maintainability, fault localization, and diagnosis, all eventually leading
to better software quality. Using a large dataset of assertions in C and C++ programs, we
confirmed this claim, i.e., methods with assertions do have significantly fewer defects. Assertions
also appear to play a positive role in collaborative software development, where many
programmers are working on the same method. We further characterized assertion usage along
process and product metrics. Such detailed characterization of assertions will help to predict
relevant locations of useful assertions and will improve code quality.
@inproceedings{ray2015uniqueness,
title={The Uniqueness of Changes: Characteristics and Applications},
author={Ray, Baishakhi and Nagappan, Meiyappan and Bird, Christian and Nagappan, Nachiappan and Zimmermann, Thomas},
series = {MSR '15},
year={2015},
organization={ACM}
}
Changes in software development come in many forms. Some changes are frequent, idiomatic, or
repetitive (e.g. adding checks for nulls or logging important values) while others are unique.
We hypothesize that unique changes are different from the more common similar (or non-unique)
changes in important ways; they may require more expertise or represent code that is more complex
or prone to mistakes. As such, these unique changes are worthy of study. In this paper, we present a
definition of unique changes and provide a method for identifying them in software project history.
Based on the results of applying our technique on the Linux kernel and two large projects at
Microsoft, we present an empirical study of unique changes. We explore how prevalent unique changes
are and investigate where they occur along the architecture of the project. We further investigate
developers’ contribution towards uniqueness of changes. We also describe potential applications of
leveraging the uniqueness of change and implement two of those applications, evaluating the risk of
changes based on uniqueness and providing change recommendations for non-unique changes.
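One way to operationalize "unique change" is to call a change unique when nothing similar appears elsewhere in the project history. This is a simplified stand-in for the paper's actual definition, and the similarity threshold is a hypothetical parameter:

```python
from difflib import SequenceMatcher

def is_unique(change, history, threshold=0.8):
    """Treat a change as unique if no other recorded change is
    textually similar above the (hypothetical) threshold."""
    return all(SequenceMatcher(None, change, past).ratio() < threshold
               for past in history)
```

A repetitive edit like adding a null check will typically match an earlier change and be classified as non-unique, while a one-off refactoring will not.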
Gender and Tenure Diversity in GitHub Teams,
10 pages, acceptance rate: 20%.
by Bogdan Vasilescu, Daryl Posnett, Baishakhi Ray, Mark van den Brand, Alexander Serebrenik, Premkumar Devanbu, Vladimir Filkov.
[CHI 2015]
@inproceedings{vasilescu2015diversity,
title={Gender and Tenure Diversity in GitHub Teams},
author={Vasilescu, Bogdan and Posnett, Daryl and Ray, Baishakhi and van den Brand, Mark and Serebrenik, Alexander and Devanbu, Premkumar and Filkov, Vladimir},
series = {CHI '15},
year={2015},
organization={ACM}
}
Using GitHub, we studied gender and tenure diversity in online
programming teams. Using the results of a survey and regression modeling of a
GitHub data set comprising over 2 million projects, we studied how diversity
relates to team productivity and turnover. We showed that both gender and
tenure diversity are positive and significant predictors of productivity. These
results can inform decision-making on all levels, leading to better outcomes in
recruiting and performance.
@inproceedings{ray2014lang,
title={A Large Scale Study of Programming Languages and Code Quality in Github},
author={Ray, Baishakhi and Posnett, Daryl and Filkov, Vladimir and Devanbu, Premkumar},
booktitle={Proceedings of the ACM SIGSOFT 22nd International Symposium on the
Foundations of Software Engineering},
series = {FSE '14},
year={2014},
organization={ACM}
}
To investigate whether a programming language is the right tool for the job, I gathered a
very large data set from GitHub (728 projects, 63M lines of code, 29K authors, 1.5M commits,
in 17 languages). Using a mixed-methods approach, combining multiple regression modeling with
visualization and text analytics, I studied the effect of language features such as static vs.
dynamic typing and strong vs. weak typing on software quality. By triangulating findings from
different methods, and controlling for confounding effects such as code size, project age, and
contributors, I observed that a language design choice does have a significant, but modest
effect on software quality.
@inproceedings{brubaker2014using,
title={Using Frankencerts for Automated Adversarial Testing of Certificate Validation
in SSL/TLS Implementations},
author={Brubaker, Chad and Jana, Suman and Ray, Baishakhi and Khurshid, Sarfraz and
Shmatikov, Vitaly},
booktitle={IEEE Symposium on Security and Privacy 2014},
year={2014},
organization={IEEE}
}
In today’s open software market, multiple software products offer users similar
functionality. For example, there is a pool of popular SSL/TLS libraries (e.g.,
OpenSSL, GnuTLS, NSS, CyaSSL, PolarSSL, MatrixSSL, etc.) for securing network
connections from man-in-the-middle attacks. Certificate validation is a crucial part of
SSL/TLS connection setup. Though implemented differently, the certificate validation logic of
these different libraries should serve the same purpose, following the SSL/TLS protocol, i.e.
for a given certificate, all of the libraries should either accept or reject it. In
collaboration with security researchers at the University of Texas at Austin, we designed the
first large-scale framework for testing certificate validation logic in SSL/TLS
implementations. First, we generated millions of synthetic certificates by randomly mutating
parts of real certificates and thus induced unusual combinations of extensions and
constraints. A valid SSL implementation should be able to detect and reject the unusual
mutants. Next, using a differential testing framework, we checked whether one SSL/TLS
implementation accepts a certificate while another rejects the same certificate. We used such
discrepancies as an oracle for finding flaws in individual implementations. We uncovered 208
discrepancies between popular SSL/TLS implementations, many of which were caused by serious
security vulnerabilities.
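The two-step recipe, recombining fields of real certificates into mutants and then using disagreement between implementations as the bug oracle, can be sketched with toy dict-based certificates and stub validators. The field names and the deliberately buggy `validator_b` are illustrative assumptions, not real library behavior:

```python
import random

def frankencerts(seeds, n, rng=None):
    """Recombine field values drawn from seed certificates, inducing
    unusual combinations of extensions and constraints."""
    rng = rng or random.Random(0)
    fields = list(seeds[0])
    return [{f: rng.choice(seeds)[f] for f in fields} for _ in range(n)]

def validator_a(cert):
    # Checks both expiry and the issuer's CA flag.
    return cert["not_expired"] and cert["issuer_is_ca"]

def validator_b(cert):
    # Deliberately buggy stub: never checks the CA flag.
    return cert["not_expired"]

def differential_test(certs, impls):
    """Return certificates on which implementations disagree; each
    discrepancy points at a flaw in at least one implementation."""
    found = []
    for cert in certs:
        verdicts = {name: impl(cert) for name, impl in impls.items()}
        if len(set(verdicts.values())) > 1:
            found.append((cert, verdicts))
    return found
```

No single implementation needs to be trusted as ground truth: any disagreement is, by the protocol's determinism, evidence of a bug somewhere.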
@inproceedings{ray2013detecting,
title={Detecting and characterizing semantic inconsistencies in ported code},
author={Ray, Baishakhi and Kim, Miryung and Person, Suzette and Rungta, Neha},
booktitle={Automated Software Engineering (ASE), 2013 IEEE/ACM 28th International Conference on},
pages={367--377},
year={2013},
organization={IEEE}
}
In order to automatically detect copy-paste errors, I investigated: (1) What are the common
types of copy-paste errors? (2) How can they be automatically detected? By analyzing
the version histories of FreeBSD and Linux, I found five common types of copy-paste errors and
then leveraging this categorization I designed a two-stage analysis technique to detect and
characterize copy-paste errors. The first stage of the analysis, SPA, detects and categorizes
inconsistencies in repetitive changes based on a static control and data dependence analysis.
SPA successfully identifies copy-paste errors with 65% to 73% precision, an improvement of 14
to 17 percentage points over previous tools. The second stage of the analysis,
SPA++, uses the inconsistencies computed by SPA to direct symbolic execution in order to
generate program behaviors that are impacted by the inconsistencies. SPA++ further compares
these program behaviors leveraging logical equivalence checking (implemented with the Z3
theorem prover) and generates test inputs that exercise program paths containing the reported
inconsistencies. A case study shows that SPA++ can refine the results reported by SPA and help
developers analyze copy-paste inconsistencies. I collaborated with researchers from NASA for
this work.
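A core pattern SPA targets, an identifier renamed in most but not all of a pasted snippet, can be illustrated with a token-level consistency check. This is a rough stand-in for SPA's actual control- and data-dependence analysis:

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def rename_inconsistencies(original, pasted):
    """Pair up identifiers of a snippet and its pasted copy in order;
    an identifier mapped to two different names is a suspect."""
    mapping, issues = {}, []
    for src, dst in zip(IDENT.findall(original), IDENT.findall(pasted)):
        if src in mapping and mapping[src] != dst:
            issues.append((src, mapping[src], dst))
        mapping.setdefault(src, dst)
    return issues
```

For example, pasting `if (p != NULL) free(p);` and renaming only the first `p` to `q` leaves a dangling reference that the mapping check flags.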
@inproceedings{mcdonnell2013empirical,
title={An empirical study of API stability and adoption in the Android ecosystem},
author={McDonnell, Tyler and Ray, Baishakhi and Kim, Miryung},
booktitle={Software Maintenance (ICSM), 2013 29th IEEE International Conference on},
pages={70--79},
year={2013},
organization={IEEE}
}
In today’s software ecosystem, which is primarily governed by web, cloud, and mobile
technologies, APIs play a key role in connecting disparate software. Big players like
Google, Facebook, and Microsoft aggressively publish new APIs to accommodate new feature
requests, bug fixes, and performance improvements. We investigated how such fast-paced
API evolution affects the overall software ecosystem. Our study on Android API evolution
showed that developers are hesitant to adopt fast-evolving, unstable APIs. For
instance, while Android updates 115 APIs per month on average, clients adopt the new APIs
rather slowly, with a median lagging period of 16 months. Furthermore, client code that
uses new APIs is typically more defect prone than code without API adaptation. To the best
of my knowledge, this is the first work studying API adoption in a large software
ecosystem, and the study suggests how to promote API adoption and how to facilitate growth
of the overall ecosystem.
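The lagging-period metric is straightforward: for each API, measure the months from its release to a client's first use of it, then take the median across APIs. The API names and dates below are invented to illustrate the computation:

```python
from datetime import date
from statistics import median

def lag_months(released, adopted):
    """Whole months between API release and first client use."""
    return (adopted.year - released.year) * 12 + (adopted.month - released.month)

def median_adoption_lag(api_release, first_use):
    """Median lag, in months, across all adopted APIs."""
    return median(lag_months(api_release[api], when)
                  for api, when in first_use.items())
```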
@inproceedings{Ray2012,
title = {A Case Study of Cross-system Porting in Forked Projects},
author = {Ray, Baishakhi and Kim, Miryung},
booktitle = {Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering},
series = {FSE 2012},
articleno = {53},
pages = {53:1--53:11}
}
This paper empirically demonstrates that developers spend a significant amount of time and
effort in introducing similar features and bug-fixes in and across different projects.
This involves a significant amount of repeated work. To automatically identify the
repetitive changes, I designed Repertoire, a source code change analysis tool that
compares the edit contents and the corresponding operations of program patches to identify
similar changes, with 94% precision and 84% recall. Using Repertoire, I showed that
developers often introduce a significant amount of repeated changes within and across
projects. Most notably, repetitive changes among forked projects (different variants of an
existing project, e.g., FreeBSD, NetBSD and OpenBSD) incur significant duplicate work. In
each BSD release, on average, more than twelve thousand lines are ported from peer
projects, and more than 25% of active developers participate in cross-system porting in
each release.
@inproceedings{ray2012repertoire,
title={Repertoire: A cross-system porting analysis tool for forked software projects},
author={Ray, Baishakhi and Wiley, Christopher and Kim, Miryung},
booktitle={Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering},
series = {FSE 2012},
articleno = {8},
pages = {8:1--8:4},
}
To create a new variant of an existing project, developers often copy an existing
codebase and modify it. This process is called software forking. After forking software,
developers often port new features or bug fixes from peer projects. Repertoire analyzes
repeated work of cross-system porting among forked projects. It takes the version
histories as input and identifies ported edits by comparing the content of individual
patches. It also shows users the extent of ported edits, where and when the ported edits
occurred, which developers ported code from peer projects, and how long it takes for
patches to be ported.
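Identifying ported edits by comparing patch content can be sketched by intersecting the added lines of unified diffs. Repertoire actually compares edit contents and operations at a finer granularity, so this set-based overlap is a simplification:

```python
def added_lines(patch):
    """Content of '+' lines in a unified diff (the '+++' header excluded)."""
    return {line[1:].strip() for line in patch.splitlines()
            if line.startswith("+") and not line.startswith("+++")}

def ported_fraction(patch_a, patch_b):
    """Fraction of patch_a's added lines that also appear in patch_b,
    a crude signal that the edit was ported between projects."""
    a = added_lines(patch_a)
    return len(a & added_lines(patch_b)) / len(a) if a else 0.0
```

Run over the version histories of two forked projects, a high fraction between a pair of patches marks a likely cross-system port.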
@inproceedings{park2012empirical,
title={An empirical study of supplementary bug fixes},
author={Park, Jihun and Kim, Miryung and Ray, Baishakhi and Bae, Doo Hwan},
booktitle={Mining Software Repositories (MSR), 2012 9th IEEE Working Conference on},
pages={40--49},
year={2012},
organization={IEEE}
}
A recent study finds that errors of omission are harder for programmers to detect than
errors of commission. While several change recommendation systems already exist to
prevent or reduce omission errors during software development, there have been very few
studies on why errors of omission occur in practice and how such errors could be
prevented. In order to understand the characteristics of omission errors, this paper
investigates a group of bugs that were fixed more than once in open source
projects—those bugs whose initial patches were later considered incomplete and to which
programmers applied supplementary patches.
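The study's unit of analysis, bugs fixed more than once, can be recovered from a version history by grouping fix commits on the bug identifier referenced in their messages. The commit hashes and bug ids below are invented:

```python
from collections import defaultdict

def supplementary_fixes(commits):
    """Group (sha, bug_id) fix commits by bug id; a bug patched more
    than once indicates an incomplete initial fix."""
    by_bug = defaultdict(list)
    for sha, bug_id in commits:
        by_bug[bug_id].append(sha)
    return {bug: shas for bug, shas in by_bug.items() if len(shas) > 1}
```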
@inproceedings{rossbach2011ptask,
title={PTask: Operating system abstractions to manage GPUs as compute devices},
author={Rossbach, Christopher J and Currey, Jon and Silberstein, Mark and Ray, Baishakhi and Witchel, Emmett},
shorthand = {SOSP'11},
booktitle={Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles},
pages={233--248},
year={2011},
organization={ACM}
}
GPUs are typically used for high-performance rendering or batch-oriented computations, but
not for general-purpose compute-intensive tasks such as brain-computer interfaces or file
system encryption. Current operating systems treat the GPU as an I/O device rather than a
general-purpose computational resource like the CPU. To overcome this, we proposed the PTask
API, a new set of OS abstractions. As part of this work, I ported EncFS, a FUSE-based
encrypted file system for Linux, to the CUDA framework so that it can use the GPU for AES
encryption and decryption. Using PTask’s GPU scheduling mechanism, I showed that running
EncFS on the GPU rather than the CPU made sequential reads and writes of a 200MB file 17%
and 28% faster, respectively.
WhozThat? Evolving an Ecosystem for Context-Aware Mobile Social Networks,
6 pages
by Aaron Beach, Mike Gartrell, Sirisha Akkala, Jack Elston, John Kelley, Keisuke Nishimoto, Baishakhi Ray,
Sergei Razgulin, Karthik Sundaresan, Bonnie Surendar, Michael Terada, Richard Han.
IEEE Network Magazine, Special Issue on Composable Context-Aware Services, 2008.