BITlab: Behavior Information Technology

BITlab
404 Wilson Rd. Room 251
Communication Arts & Sciences
Michigan State University
East Lansing, MI 48824

Six BITLab REU Students have posters at Mid-SURE

By: Rick Wash

The annual summer Mid-SURE (Mid-Michigan Symposium for Undergraduate Research Experiences) event will be held on July 23, 1:00-4:00 PM at the Breslin Center.

The BITLab will be represented by six student posters, featuring projects these students have worked on during their 10-week NSF REU (Research Experiences for Undergraduates) internships with us.

Title: The Content Network
Authors: Cody Baker (University of Evansville), Dr. Emilee Rader
Project: Algorithmic Curation

The massive rise in popularity of social networking sites has exacerbated a common problem in the communication sciences: information overload. To combat this issue, the designers of certain social networking sites have recently begun to utilize methods of algorithmic curation. These methods are analogous to recommender systems that present users with the items they are most likely to enjoy, such as Netflix’s movie recommendations. The critical difference between such a system and algorithmic curation is that curation will purposefully deny the user certain information, rather than simply ‘sorting’ all existing information according to likelihood of consumption. A prime example of such behavior is Facebook’s News Feed algorithm. Though the exact details of such a system are a well-kept secret, prior studies have shown that the News Feed will often choose not to display posts from certain individuals. The goal of such behavior is to maximize user interest according to the popularity of content. Though these methods are common, very little is known regarding the large-scale implications of such limiting effects on the diffusion of information. In seeking to understand these effects, we built a detailed computer simulation of a social network, and conducted experiments on various parameters of the system. Through this, we uncovered curious details regarding the structure of an individual’s content network: the subset of their typical friend network that is regularly informed of the individual’s activity. This knowledge aids us in predicting the consequences of withholding information and the subsequent influence on diffusion through the network.

Title: Understanding Expectations of Online Communities
Authors: Stephanie Peña (University of Michigan), Dr. Richard Wash
Project: Online Communities

Online communities consist of large groups of people that aggregate online to work, socialize, and communicate at a tremendous scale that would not be possible without the use of the internet. In these communities, like Facebook, Wikipedia, and Reddit, potential users develop expectations and weigh the benefits of their participation. Through our research study, we aim to develop a better understanding of why and how people choose to participate in online communities. Our research goal is to further understand the expectations people form as they are introduced to new online communities. We administered qualitative interviews in which 50 subjects were exposed to Reddit or Quora for the first time. We questioned the subjects on any previous knowledge of the online community, had subjects complete a ‘think-aloud’ as they visited the community, and questioned subjects on their future expected participation in the community. Interviews were then transcribed, cleaned, and coded. Since our research project is in its first year of operation, we are still in the process of analyzing the qualitative data we have collected.

Title: Interactions between Security Interfaces, Learning, and Sensitivity of Information
Authors: Lezlie España (Wisconsin Lutheran College), Dr. Rick Wash, Dr. Kami Vaniea, Dr. Emilee Rader
Project: Security

From locking the car to logging in to a favorite social network site, people interact with security on a daily basis. When people lock their keys in their car, they learn something and adjust their behavior accordingly. Applying this idea to internet security, our research looks for ways in which people learn about what information should be kept secure. Our goal is to see if people learn about online security through their interactions with security interfaces. Do people learn from having to log in to see information on websites? How does this learning affect their security behavior? We developed an experimental website where we manipulated the ways in which participants interacted with information and the login process. By having participants log in to see information that was hidden and then allowing them to choose privacy settings based on the information they saw, we hope to demonstrate that people can learn about online security. A better understanding of how people learn through these interfaces can help us further explain and encourage secure behavior. This poster will present the first step in understanding what people learn about sensitive information and how that learning is affected by the security interfaces involved.

Title: URL-based LDA for Prefetching
Authors: Nathan Klein (Oberlin College), Dr. Kami Vaniea, Dr. Rick Wash, Dr. Emilee Rader
Project: Security

Web prefetching is a technique used to reduce latency on the web by preloading webpages a user is likely to visit in the future. In this study we introduce an approach we call URL-based LDA prefetching, or UBLP, and evaluate its effectiveness at improving prefetching methods. UBLP is an integrative model which treats web sessions as documents of URLs split by separation characters such as “\”, a notion introduced by Wan et al. Using latent Dirichlet allocation, UBLP leverages Hellinger distance to measure the similarity between user sessions in a combined Markov model and weighted K-Nearest-Neighbors prefetching approach. Using a top-k prefetching scheme, we evaluate our model on two client-side datasets and two server-side datasets. We find that UBLP improves prefetching accuracy over All-Kth-Order Markov models, particularly on the client-side datasets. Utilizing its flexible topic-based prediction, our model is able to overcome the upper bound that sequence-based machine learning algorithms face due to their reliance on exact state matches. In addition, our model is fully updateable, since LDA is online, and UBLP is based on updateable machine learning algorithms. In an effort to allow for better comparison between prefetching research efforts in the future, we describe our preprocessing methods in detail, as well as provide a publicly available repository of our preprocessed data.
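As a rough illustration of the similarity step mentioned in this abstract, the sketch below computes the Hellinger distance between two sessions' topic distributions and uses it to pick the nearest neighbors of a query session. It is not the poster's implementation: the function names, the session IDs, and the toy topic vectors are all hypothetical, and it assumes each session has already been reduced to a probability distribution over LDA topics.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with no overlapping support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


def nearest_sessions(query, sessions, k):
    """Return the k (distance, session_id) pairs closest to the query.

    `sessions` maps a hypothetical session ID to its topic distribution.
    """
    scored = sorted((hellinger(query, topics), sid)
                    for sid, topics in sessions.items())
    return scored[:k]


# Toy topic distributions over three topics (made-up values).
past_sessions = {
    "s1": [0.8, 0.1, 0.1],
    "s2": [0.1, 0.8, 0.1],
    "s3": [0.7, 0.2, 0.1],
}
current = [0.75, 0.15, 0.10]
neighbors = nearest_sessions(current, past_sessions, k=2)
```

In a weighted K-Nearest-Neighbors scheme, the neighbors' distances could then be turned into vote weights (for example, one minus the distance) for the pages those sessions visited next; that weighting choice is an assumption here, not something the abstract specifies.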

Title: What Does Computer Security Cost You?
Authors: Shiwani Bisht (Cornell University), Nathan Klein, Dr. Kami Vaniea, Dr. Rick Wash, Dr. Emilee Rader
Project: Security

Computer users who are tired of responding to security dialogs, updating software, and running system scans often complain about the amount of time they spend interacting with security-related tasks on their computers. These users may change default security settings to avoid these activities, but at what cost? In this study, we investigate how much time the average computer user spends interacting with computer security using data collected from Microsoft Windows 7 and 8 users over a period of a week. After collecting data from users, we found the average number of times that a user performs security-related activities per week, and estimated the time spent on these interactions based on previous work. We discuss the challenges of identifying, through computer logs, when users are interacting with security versus when the system is performing security-related tasks on the user’s behalf. Security professionals continuously tell users how important it is to practice good security behaviors, but rarely consider the amount of time practicing good security takes.

Title: Computer security information in stories, news articles, and education documents
Authors: Katie Hoban (Michigan State University), Dr. Emilee Rader, Dr. Rick Wash, Dr. Kami Vaniea
Project: Security

Despite the large amount of computer security information available to them, end users are often thought of as the weakest link in computer security. The information they have access to comes in many formats: news articles, news broadcasts, education documents, books, stories, and many more. However, inefficient or inconsistent communication between content providers and content consumers may result in knowledge gaps for the consumers. The quality and attributes of the computer security information available to users impact their ability to learn from it. To better understand this state of affairs, we collected news articles, education documents, and survey responses describing stories end users had heard about computer security. We then analyzed the trends present across all three datasets. We found that there are serious mismatches between these datasets concerning the topics of hackers, viruses + malware, and phishing + spam, discrepancies that may impact the communication of computer security information to end users.