RESEARCH

Hancock

Hancock is a C-based domain-specific language designed to make it easy to read, write, and maintain programs that process large amounts of relatively uniform data to build signatures, that is, profiles of typical behavior. Hancock was originally motivated by fraud detection, which depends on understanding the normal behavior of AT&T customers. Because Hancock is embedded in C, it inherits all of C's functionality: valid C programs are also valid Hancock programs, and Hancock programs can use libraries written for C. But Hancock is more than C. In addition to C constructs, it provides domain-specific forms to facilitate large-scale data processing. For a given data-processing task, Hancock may be suitable if:

  • The task requires a small number of linear passes over a relatively uniform data source.
  • The task requires storing persistent information.
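To make this task profile concrete, here is a minimal sketch in plain C of one linear pass over uniform records that updates a per-customer signature. It does not use Hancock's domain-specific forms, and every name in it (call_record, signature, update_signature, process_stream) and the blending weight alpha are assumptions made for illustration only. The signature table lives in memory here; in a real application it would be the persistent information carried from one run to the next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type: one entry in a uniform stream of call data. */
typedef struct {
    unsigned customer_id;   /* index into the signature table */
    double   minutes;       /* duration of the call */
} call_record;

/* Hypothetical signature: a running summary of one customer's usage. */
typedef struct {
    double        avg_minutes;  /* blended average call length */
    unsigned long calls_seen;   /* number of records folded in so far */
} signature;

/* Blend one new observation into an existing signature.
 * alpha is the weight given to the new observation (assumed 0 < alpha <= 1). */
void update_signature(signature *sig, double minutes, double alpha)
{
    assert(alpha > 0.0 && alpha <= 1.0);
    if (sig->calls_seen == 0)
        sig->avg_minutes = minutes;   /* first record seeds the average */
    else
        sig->avg_minutes = (1.0 - alpha) * sig->avg_minutes + alpha * minutes;
    sig->calls_seen++;
}

/* One linear pass over the records. Records whose customer_id falls
 * outside the table are skipped. Returns the number of records applied. */
size_t process_stream(const call_record *recs, size_t n,
                      signature *table, size_t table_len, double alpha)
{
    size_t applied = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].customer_id >= table_len)
            continue;
        update_signature(&table[recs[i].customer_id], recs[i].minutes, alpha);
        applied++;
    }
    return applied;
}
```

A caller would zero-initialize an array of signature values, load or construct an array of call_record values, and call process_stream once per batch. Each record is read exactly once, and only the signature table needs to survive between batches.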

Papers

  • Hancock: A Language for Analyzing Transactional Data Streams, C. Cortes, K. Fisher, D. Pregibon, A. Rogers, and F. Smith. In Data Stream Management: Processing High-Speed Data Streams, M. Garofalakis, J. Gehrke, and R. Rastogi, editors, Part 4, Chapter 4, pages 387–408. Springer, 2016.
  • Hancock: A Language for Analyzing Transactional Data Streams, K. Fisher, K. Hogstedt, A. Rogers, and F. Smith. In ACM Transactions on Programming Languages and Systems 26, 2 (March 2004), pages 263–300.
  • An Application-Specific Database, K. Fisher, C. Goodall, K. Hogstedt, and A. Rogers. In Proceedings of the Eighth Biennial Workshop on Data Bases and Programming Languages, 2001.
  • Hancock: A Language for Extracting Signatures from Data Streams, C. Cortes, K. Fisher, D. Pregibon, A. Rogers, and F. Smith. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining, 2000, pages 9–17.
  • Hancock: A Language for Processing Very Large-Scale Data, D. Bonachea, K. Fisher, A. Rogers, and F. Smith. In USENIX 2nd Conference on Domain-Specific Languages, 1999, pages 163–176.