CorpusStudio:
Surfacing Emergent Patterns in a Corpus of Prior Work while Writing

Given a corpus of papers written for the same or similar audiences, e.g., papers previously published at ACM UIST, CorpusStudio supports writers by making visible the writing choices of previous authors in the corpus. To help the writer recognize common and uncommon paper structures, the left sidebar (A) uses Positional Diction Clustering to show an ordered distribution of clusters of section titles in the corpus. Informed by (A), writers can draft their own outline in the center text editor (B). When fleshing out their outline with prose, writers can press TAB to retrieve analogous sentence examples from the corpus (C) based on their cursor's location within their own draft, which can surface emergent patterns in previously written papers. To reveal these patterns, the writer can select different modes of highlighting commonalities and variation across retrieved sentences. The writer can hover over a retrieved sentence (D) to see more of the context in which it appeared, and can save, annotate, and share retrieved sentence examples that they think fulfill a purpose particularly well or poorly.
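
As a rough illustration of cursor-based retrieval, the sketch below returns corpus sentences from the same section whose relative position is closest to the cursor's. It is not CorpusStudio's actual retrieval method: the `CorpusSentence` record, the `relative_position` and `retrieve_analogous` helpers, and the choice to match on section name and relative position alone are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CorpusSentence:
    """Hypothetical record for one sentence in the corpus."""
    paper_id: str
    section: str     # normalized section title, e.g. "introduction"
    position: float  # relative position within its section, 0.0 to 1.0
    text: str


def relative_position(sentence_index: int, sentence_count: int) -> float:
    """Map a sentence index to a 0..1 position within its section."""
    if sentence_count <= 1:
        return 0.0
    return sentence_index / (sentence_count - 1)


def retrieve_analogous(corpus, section, cursor_position, k=3):
    """Return up to k sentences from the same section, nearest in relative position.

    Assumption: "analogous" is approximated here by section match plus
    positional distance only; a real system may also use semantic similarity.
    """
    candidates = [s for s in corpus if s.section == section]
    candidates.sort(key=lambda s: abs(s.position - cursor_position))
    return candidates[:k]
```

For instance, a cursor at the second of ten sentences in the writer's introduction maps to `relative_position(1, 10)`, and `retrieve_analogous(corpus, "introduction", relative_position(1, 10))` would return sentences that earlier authors placed near the start of their own introductions.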