Bayesian Models Linking Web Site Structure and Usage

Case Study

Challenges

Despite its ubiquity, the World Wide Web is poorly understood. As a consequence, many sites are difficult to navigate, hard to use and have confusing structure, to the extent that users may be unable to find content and abandon the site. Essential needs are to relate user behavior to website structure; to compare site usage at different times, or for different classes of users; to segment sessions; to quantify inter-relationships among pages; and to predict user behavior, including forecasts of, for example, the economic impact of promotional campaigns. The ultimate impact is more efficient websites that serve users more effectively.

Outcomes & Results

NISS created a set of four increasingly complex, but scalable, Bayesian models that relate the usage (specifically, user page transitions) of a website to its structure. The team used real data from four qualitatively different websites: an e-commerce site, a site operated by a large financial institution, a content site and an information site. The models were used to examine the structure and usage of each type of website.

By using the models, the team can better understand how a website is being used by examining appropriate posterior probabilities, and designers can use these as a guide for improving the website design. Designers can also more easily describe their notion of the design of the website and their concept of desirable user behavior.


Research Project

A critical problem for users of the World Wide Web is that many sites are difficult to navigate, hard to use and have confusing structure. Users may become lost, and they may make large leaps within a website (for example, returning unnecessarily to the home page) that are inconsistent with its structure. They may be unable to find content and abandon the site. It is also becoming harder to develop a good website. A structure that seems intuitive to an author may be highly confusing to everyone else. One way to improve usability is to conduct formal user studies and measure user performance for specific tasks, but this is too expensive for all but a few sites. A second approach is to exploit the rich instrumentation in the on-line world, which is the approach NISS used for this study.

NISS created a set of four increasingly complex, but scalable, Bayesian models that relate the usage of a website to its structure. The team applied, validated and refined the models using real data from four qualitatively different websites: an e-commerce site, a site operated by a large financial institution, a content site, and an information site. The models are scalable because the destinations from a given page are classes of pages that mirror the tree structure of the site, rather than individual pages. All four models assumed Dirichlet prior distributions for transitions from each page. The first three employed very aggregated classes of transitions, and differed according to whether the transition distributions and the priors were the same for all pages. The fourth model disaggregated the "child" and "sibling" destinations. Calculation of posterior distributions varied in difficulty: some were available in closed form, while others required intensive Markov chain Monte Carlo computation. In addition, rigorous model assessment provided insight into what level of aggregation was appropriate to which analyses of Web data.
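As a rough illustration of the closed-form case, the sketch below shows how a Dirichlet prior over destination classes combines with observed transition counts from one page. It is not the NISS models themselves: the class names "parent" and "other", the use of URL path segments to stand in for the site tree, the uniform prior, and the function names classify and posterior_mean are all assumptions made for this example. Only the "child" and "sibling" classes are named in the project description.

```python
from collections import Counter

# Hypothetical destination classes; the project description names only
# "child" and "sibling". "parent" and "other" are assumed for illustration.
CLASSES = ("parent", "child", "sibling", "other")


def classify(src, dst):
    """Assign a transition to a tree-based class using URL path segments.

    Assumption: the URL path hierarchy mirrors the site's tree structure.
    """
    s = [p for p in src.split("/") if p]
    d = [p for p in dst.split("/") if p]
    if s and d == s[:-1]:
        return "parent"   # destination is one level up from the source
    if d and d[:-1] == s:
        return "child"    # destination is one level below the source
    if s and len(s) == len(d) and s[:-1] == d[:-1] and s != d:
        return "sibling"  # same parent, different page
    return "other"


def posterior_mean(prior, counts):
    """Dirichlet-multinomial conjugate update.

    With prior Dirichlet(alpha) and observed counts n, the posterior is
    Dirichlet(alpha + n); its mean is (alpha_k + n_k) / sum_j (alpha_j + n_j).
    """
    post = {k: prior[k] + counts.get(k, 0) for k in prior}
    total = sum(post.values())
    return {k: v / total for k, v in post.items()}


# Made-up transitions out of a single page, "/products".
transitions = [
    ("/products", "/products/a"),
    ("/products", "/products/b"),
    ("/products", "/"),
    ("/products", "/about"),
]
counts = Counter(classify(s, d) for s, d in transitions)

# Assumed uniform prior: one pseudo-count per class.
prior = {k: 1.0 for k in CLASSES}
mean = posterior_mean(prior, counts)

for k in CLASSES:
    print(f"{k}: {mean[k]:.3f}")
```

Because destinations are grouped into a handful of classes rather than tracked page by page, the number of parameters per page stays fixed as the site grows, which is the sense in which the description calls the models scalable. Models whose priors are shared or linked across pages would generally not reduce to a per-page update like this one and are the kind that call for Markov chain Monte Carlo.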

Project Goal: 

Find a way to develop websites that are easy for users to navigate.

Research Team: 

Principal Investigator: Alan Karr, NISS

Post Doctoral Fellow(s): Murali Haran, Ashish Sanil

Individual Team Members: 
Alan F. Karr, Murali Haran, Ashish Sanil

Funding Sponsors: