USENIX WebApps '10

CONFERENCE PROGRAM ABSTRACTS

Wednesday, June 23, 2010
10:30 a.m.–Noon

Separating Web Applications from User Data Storage with BSTORE
This paper presents BSTORE, a framework that allows developers to separate their web application code from user data storage. With BSTORE, storage providers implement a standard file system API, and applications access user data through that same API without having to worry about where the data might be stored. A file system manager allows the user and applications to combine multiple file systems into a single namespace, and to control what data each application can access. One key idea in BSTORE's design is the use of tags on files, which allows applications both to organize data in different ways and to delegate fine-grained access to other applications. We have implemented a prototype of BSTORE in JavaScript that runs in unmodified Firefox and Chrome browsers. We also implemented three file systems and ported three different applications to BSTORE. Our prototype incurs an acceptable performance overhead of less than 5% on a 10Mbps network connection, and porting existing client-side applications to BSTORE required only minor source code changes.
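The tag mechanism can be sketched in a few lines; the class and method names below are illustrative assumptions, not BSTORE's actual file system API.

```javascript
// Sketch of tag-based organization and delegation in a BSTORE-like
// store. All names here are illustrative, not the paper's API.
class TagStore {
  constructor() { this.files = new Map(); } // name -> { data, tags }
  put(name, data, tags = []) { this.files.set(name, { data, tags: new Set(tags) }); }
  addTag(name, tag) { this.files.get(name).tags.add(tag); }
  // An application that has been delegated a tag sees only files carrying it.
  listByTag(tag) {
    return [...this.files].filter(([, f]) => f.tags.has(tag)).map(([n]) => n);
  }
  read(name, delegatedTag) {
    const f = this.files.get(name);
    if (!f || !f.tags.has(delegatedTag)) throw new Error('access denied');
    return f.data;
  }
}

const store = new TagStore();
store.put('vacation.jpg', '<bytes>', ['photos']);
store.put('diary.txt', 'secret', ['private']);
store.addTag('vacation.jpg', 'shared-with-editor'); // fine-grained delegation
console.log(store.listByTag('shared-with-editor'));
```

Delegating the `shared-with-editor` tag lets another application see the tagged photo without gaining any access to the private diary file.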

AjaxTracker: Active Measurement System for High-Fidelity Characterization of AJAX Applications
Cloud-based Web applications powered by new technologies such as Asynchronous JavaScript and XML (Ajax) place a significant burden on network operators and enterprises to effectively manage traffic. Despite their increasing popularity, we have little understanding of the characteristics of these cloud applications. Part of the problem is that there exists no systematic way to generate their workloads, observe their network behavior, and keep track of their changing trends. This paper addresses these issues with a tool, called AJAXTRACKER, that automatically mimics human interaction with a cloud application and collects the associated network traces. These traces can be post-processed to understand various characteristics of these applications, and those characteristics can be fed into a classifier to identify new traffic for a particular application in a passive trace. The tool can also be used by service providers to automatically generate relevant workloads to monitor and test specific applications.
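The core replay loop of such a tool can be sketched as follows; the fake application, the event names, and the think times are all assumptions for illustration, and a simulated clock stands in for real wall-clock delays.

```javascript
// Sketch of AJAXTRACKER-style scripted interaction: drive an
// application with a scenario of user events and record the resulting
// requests with timestamps for later post-processing.
const trace = [];
let clock = 0; // simulated milliseconds (a real tool uses wall-clock time)
const logRequest = (url) => trace.push({ t: clock, url });

// Stand-in for the application under test: each UI event fires one
// background (Ajax-style) request.
const app = {
  click: (target) => logRequest(`/api/${target}`),
  type:  (text)   => logRequest(`/api/suggest?q=${text}`),
};

// A scenario mimicking a human: [event, argument, think time in ms].
const scenario = [
  ['click', 'inbox', 50],
  ['type',  'hello', 120],
  ['click', 'send',  80],
];

for (const [event, arg, thinkTime] of scenario) {
  app[event](arg);
  clock += thinkTime; // pause between events, as a human would
}
console.log(trace);
```

The resulting timestamped trace is the raw material that post-processing and traffic classification would consume.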

JSMeter: Comparing the Behavior of JavaScript Benchmarks with Real Web Applications
JavaScript is widely used in web-based applications and is increasingly popular with developers. The so-called browser wars of recent years have focused on JavaScript performance, with vendors claiming comparative results based on benchmark suites such as SunSpider and V8. In this paper we evaluate the behavior of JavaScript web applications from commercial web sites and compare this behavior with the benchmarks. We measure two specific areas of JavaScript runtime behavior: 1) functions and code and 2) events and handlers. We find that the benchmarks are not representative of many real web sites and that conclusions reached from measuring the benchmarks may be misleading. Specific common behaviors of real web sites that are underemphasized in the benchmarks include event-driven execution, instruction mix similarity, cold-code dominance, and the prevalence of short functions. We hope our results will convince the JavaScript community to develop and adopt benchmarks that are more representative of real web applications.
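The kind of measurement involved can be sketched with a simple call-counting wrapper: functions that are loaded but never executed are the "cold code" the paper finds dominant on real sites. This wrapper is a simplification of what JSMeter instruments inside the runtime, and the names are assumptions.

```javascript
// Count function invocations by wrapping functions; code that is
// loaded but never run ("cold code") never appears in the counts.
const counts = new Map();
const instrument = (name, fn) => (...args) => {
  counts.set(name, (counts.get(name) || 0) + 1);
  return fn(...args);
};

const parseDate = instrument('parseDate', (s) => new Date(s));
const onClick   = instrument('onClick', () => parseDate('2010-06-23'));
const coldPath  = instrument('coldPath', () => {}); // loaded, never called

// Simulate an event-driven workload: many short handler firings.
for (let i = 0; i < 100; i++) onClick();

console.log(counts.get('onClick'), counts.has('coldPath'));
```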

1:30 p.m.–3:00 p.m.

JSZap: Compressing JavaScript Code
JavaScript is widely used in web-based applications, and gigabytes of JavaScript code are transmitted over the Internet every day. Current efforts to compress JavaScript to reduce network delays and server bandwidth requirements rely on syntactic changes to the source code and content encoding using gzip. This paper considers reducing the JavaScript source to a compressed abstract syntax tree (AST) and transmitting it in this format. An AST-based representation has a number of benefits including reducing parsing time on the client, fast checking for well-formedness, and, as we show, compression. With JSZAP, we transform the JavaScript source into three streams: AST production rules, identifiers, and literals, each of which is compressed independently. While previous work has compressed Java programs using ASTs for network transmission, no prior work has applied and evaluated these techniques for JavaScript source code, despite the fact that it is by far the most commonly transmitted program representation. We show that in JavaScript the literals and identifiers constitute the majority of the total file size and we describe techniques that compress each stream effectively. On average, compared to gzip we reduce the production, identifier, and literal streams by 30%, 12%, and 4%, respectively. Overall, we reduce total file size by 10% compared to gzip while, at the same time, benefiting the client by moving some of the necessary processing to the server.

Leveraging Cognitive Factors in Securing WWW with CAPTCHA
Human Interactive Proof systems using CAPTCHAs help protect services on the World Wide Web (WWW) from widespread abuse by verifying that a human, not an automated program, is making a request. To authenticate a user as human, a test must be passable by virtually all humans, but not by computer programs. For a CAPTCHA to be useful online, it must also be easy for humans to interpret. In this paper, we present a new method that combines handwritten CAPTCHAs with a random tree structure and random test questions to create a novel and more robust implementation that leverages unique features of human cognition, including our superior ability over machines to recognize graphics and to read unconstrained handwritten text that has been transformed in precise ways. This combined CAPTCHA protects against advances in recognition systems to ensure it remains viable in the future without causing additional difficulty for humans. We present the motivation for our approach, algorithm development, and experimental results that support our CAPTCHA in protecting web services while providing important insights into the human cognitive factors at play during human-computer interaction.

GULFSTREAM: Staged Static Analysis for Streaming JavaScript Applications
The advent of Web 2.0 has led to the proliferation of client-side code that is typically written in JavaScript. Recently, there has been an upsurge of interest in static analysis of client-side JavaScript for applications such as bug finding and optimization. However, most approaches in the static analysis literature assume that the entire program is available to the analysis. This assumption is in direct contradiction with the nature of Web 2.0 programs, which are essentially streamed to the user's browser. Just as data is streamed to pages in the form of page updates, code can be streamed as well, delaying its download until it is needed; in essence, the entire program is never completely available, and interacting with the application causes more code to be sent to the browser. This paper explores staged static analysis as a way to analyze streaming JavaScript programs. We observe that while there is variance in terms of the code that gets sent to the client, much of the code of a typical JavaScript application can be determined statically. As a result, we advocate combined offline-online static analysis as a way to accomplish fast, browser-based client-side online analysis at the expense of a more thorough and costly server-based offline analysis of the static code. We find that in normal use, where updates to the code are small, we can update static analysis results quickly enough in the browser to be acceptable for everyday use. We demonstrate the staged analysis approach to be especially advantageous on mobile devices, by experimenting on popular applications such as Facebook.
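The staged idea can be sketched with a toy call-graph analysis standing in for the paper's actual static analysis: an offline pass covers the statically known code, and each streamed-in fragment is analyzed as a small delta merged into the existing results. All names and the fact format are assumptions for illustration.

```javascript
// Toy staged analysis: an offline stage builds a call graph from the
// code known statically; each online update only processes new facts
// and merges them into the existing graph, rather than re-analyzing
// the whole program.
const callGraph = new Map(); // caller -> Set of callees

function analyze(facts) {
  for (const { caller, callee } of facts) {
    if (!callGraph.has(caller)) callGraph.set(caller, new Set());
    callGraph.get(caller).add(callee);
  }
}

// Offline (server-side) stage: the bulk of the application.
analyze([
  { caller: 'main',   callee: 'render' },
  { caller: 'render', callee: 'formatDate' },
]);

// Online (browser-side) stage: a small streamed-in code fragment is
// analyzed incrementally, reusing everything computed offline.
analyze([{ caller: 'render', callee: 'lazyWidget' }]);

console.log([...callGraph.get('render')]);
```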

Thursday, June 24, 2010
10:30 a.m.–Noon

Managing State for Ajax-Driven Web Components
Ajax-driven Web applications require state to be maintained across a series of server requests related to a single Web page. This conflicts with the stateless approach used in most Web servers and makes it difficult to create modular components that use Ajax. We implemented and evaluated two approaches to managing component state: one, called reminders, stores the state on the browser; the other, called page properties, stores the state on the server. Both approaches enable modular Ajax-driven components, but both introduce overhead for managing the state; in addition, the reminder approach creates security issues and the page property approach introduces storage reclamation problems. Because of the subtlety and severity of the security issues with the reminder approach, we argue that it is better to store Ajax state on the server.

SVC: Selector-based View Composition for Web Frameworks
We present Selector-based View Composition (SVC), a new programming style for web application development. Using SVC, a developer defines a web page as a series of transformations on an initial state. Each transformation consists of a selector (used to select parts of the page) and an action (used to modify content matched by the selector). SVC applies these transformations on either the client or the server to generate the complete web page. Developers gain two advantages by programming in this style. First, SVC can automatically add Ajax support to sites, allowing a developer to write interactive web applications without writing any JavaScript. Second, the developer can reason about the structure of the page and write code to exploit that structure, increasing the power and reducing the complexity of code that manipulates the page's content. We introduce SVC as a stateless, framework-agnostic development style. We also describe the design, implementation and evaluation of a prototype, consisting of a PHP extension using the WebKit browser engine [37] and a plugin to the popular PHP MVC framework Code Igniter [8]. To illustrate the general usefulness of SVC, we describe the implementation of three example applications consisting of common Ajax patterns. Finally, we describe the implementation of three post-processing filters and compare them to currently existing solutions.
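The transformation style can be sketched with a flat map standing in for the DOM; SVC itself operates on real HTML and can apply the same transformation list on either the server (for the initial load) or the client (as an Ajax update). Function names here are illustrative assumptions.

```javascript
// Sketch of SVC's style: a page is an initial state plus a list of
// (selector, action) transformations. A flat selector -> content map
// stands in for a real DOM.
const transforms = [];
const replaceWith = (selector, html) =>
  transforms.push({ selector, op: 'replace', html });
const append = (selector, html) =>
  transforms.push({ selector, op: 'append', html });

function render(initial) {
  const page = new Map(Object.entries(initial));
  for (const { selector, op, html } of transforms) {
    if (op === 'replace') page.set(selector, html);
    else page.set(selector, (page.get(selector) || '') + html);
  }
  return page;
}

replaceWith('#title', 'Inbox (3)');
append('#messages', '<li>new mail</li>');
const page = render({ '#title': 'Inbox', '#messages': '' });
console.log(page.get('#title'));
```

Because the transformation list is plain data, the same list can be serialized to the browser and applied there, which is how this style supports Ajax updates without handwritten JavaScript.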

Silo: Exploiting JavaScript and DOM Storage for Faster Page Loads
A modern web page contains many objects, and fetching these objects requires many network round trips—establishing each HTTP connection requires a TCP handshake, and each HTTP request/response pair requires at least one round trip. To decrease a page's load time, designers try to minimize the number of HTTP requests needed to fetch the constituent objects. A common strategy is to inline the page's JavaScript and CSS files instead of using external links (and thus separate HTTP fetches). Unfortunately, browsers only cache externally named objects, so inlining trades fewer HTTP requests now for greater bandwidth consumption later if a user revisits a page and must refetch uncacheable files. Our new system, called Silo, leverages JavaScript and DOM storage to reduce both the number of HTTP requests and the bandwidth required to construct a page. DOM storage allows a web page to maintain a key-value database on a client machine. A Silo-enabled page uses this local storage as an LBFS-style chunkstore. When a browser requests a Silo-enabled page, the server returns a small JavaScript shim which sends the ids of locally available chunks to the server. The server responds with a list of the chunks in the inlined page, and the raw data for chunks missing on the client. Like standard inlining, Silo reduces the number of HTTP requests; however, it facilitates finer-grained caching, since each chunk corresponds to a small piece of JavaScript or CSS. The client-side portion of Silo is written in standard JavaScript, so it runs on unmodified browsers and does not require users to install special plugins.

1:30 p.m.–3:00 p.m.

Pixaxe: A Declarative, Client-Focused Web Application Framework
This paper provides a brief introduction to and overview of the Pixaxe Web Application Framework ("Pixaxe"). Pixaxe is a framework with several novel features, including a transparent template system that runs entirely within the web browser, an emphasis on developing rich internet applications as simple web pages, and pushing as much logic and rendering overhead to the client as possible. This paper also introduces several underlying technologies of Pixaxe, each of which can be used separately: Jenner, a completely client-side template engine; Esel, a powerful expression and query language; and Kouprey, a parser combinator library for ECMAScript.

Featherweight Firefox: Formalizing the Core of a Web Browser
We offer a formal specification of the core functionality of a web browser in the form of a small-step operational semantics. The specification accurately models the asynchronous nature of web browsers and covers the basic aspects of windows, DOM trees, cookies, HTTP requests and responses, user input, and a minimal scripting language with first-class functions, dynamic evaluation, and AJAX requests. No security enforcement mechanisms are included—instead, the model is intended to serve as a basis for formalizing and experimenting with different security policies and mechanisms. We survey the most interesting design choices and discuss how our model relates to real web browsers.

DBTaint: Cross-Application Information Flow Tracking via Databases
Information flow tracking has been an effective approach for identifying malicious input and detecting software vulnerabilities. However, most current schemes can only track data within a single application. This single-application approach means that the program must consider data from other programs as either all tainted or all untainted, inevitably causing false positives or false negatives. These schemes are insufficient for most Web services because these services include multiple applications, such as a Web application and a database application. Although system-wide information flow tracking is available, these approaches are expensive and overkill for tracking data between Web applications and databases because they fail to take advantage of database semantics. We have designed DBTaint, which provides information flow tracking in databases to enable cross-application information flow tracking. In DBTaint, we extend database datatypes to maintain and propagate taint bits on each value. We integrate Web application and database taint tracking engines by modifying the database interface, providing cross-application information flow tracking transparently to the Web application. We present two prototype implementations for Perl and Java Web services, and evaluate their effectiveness on two real-world Web applications, an enterprise-grade application written in Perl and a robust forum application written in Java. By taking advantage of the semantics of database operations, DBTaint has low overhead: our unoptimized prototype incurs less than 10–15% overhead in our benchmarks.
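The composite-value idea can be sketched as follows; the in-memory Map stands in for a DBTaint-extended database table, and the class and function names are assumptions, not DBTaint's interface.

```javascript
// Sketch of taint propagation across a database boundary: each value
// carries a taint bit that survives operations and a round trip
// through a (mocked) database.
class Tainted {
  constructor(value, taint) { this.value = value; this.taint = taint; }
  concat(other) { // any operation involving tainted data yields tainted data
    return new Tainted(this.value + other.value, this.taint || other.taint);
  }
}

const db = new Map(); // stands in for a DBTaint-extended table
function insert(key, tv) { db.set(key, { v: tv.value, t: tv.taint }); }
function select(key) { const r = db.get(key); return new Tainted(r.v, r.t); }

const userInput = new Tainted('<script>', true); // from the network: tainted
const template  = new Tainted('Hello ', false);  // application constant
insert('greeting', template.concat(userInput));

// A second application reading the database still sees the taint,
// which is the cross-application tracking a single-application scheme lacks.
console.log(select('greeting').taint);
```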

3:30 p.m.–4:30 p.m.

xJS: Practical XSS Prevention for Web Application Development
We present xJS, a practical framework for preventing code injections in the web environment and thus assisting in the development of XSS-free web applications. xJS aims to be fast and developer-friendly and to provide backwards compatibility. We implement and evaluate our solution in three leading web browsers and in the Apache web server. We show that our framework can successfully prevent all 1,380 real-world attacks that were collected from a well-known XSS attack repository. Furthermore, our framework imposes negligible computational overhead on both the server and the client side, and has no negative side effects on the user's overall browsing experience.

SeerSuite: Developing a Scalable and Reliable Application Framework for Building Digital Libraries by Crawling the Web
SeerSuite is a framework for scientific and academic digital libraries and search engines built by crawling scientific and academic documents from the web with a focus on providing reliable, robust services. In addition to full text indexing, SeerSuite supports autonomous citation indexing and automatically links references in research articles to facilitate navigation, analysis and evaluation. SeerSuite enables access to extensive document, citation, and author metadata by automatically extracting, storing and indexing metadata. SeerSuite also supports MyCiteSeer, a personal portal that allows users to monitor documents, store user queries, build document portfolios, and interact with the document metadata. We describe the design of SeerSuite and the deployment and usage of CiteSeerx as an instance of SeerSuite.


Last changed: 25 June 2010 jp