Alex J. DeWitt and Jasna Kuljis: A Usability Study of Polaris

July 13, 2006 by Ping

Read the paper here.

The Polaris software is described in a technical report at the HP website.  It isolates applications in separate user accounts to reduce the damage that can be done by viruses and trojans.

The authors conducted a usability study measuring effectiveness, efficiency, and user satisfaction, and asked users to complete eight tasks.  The documentation was included in the evaluation.

There were a lot of errors: an average of 59 per user in the experiment.  Because the software was an alpha release, there were bugs (some of which have since been fixed).

It was difficult for users to know whether Polaris was protecting a given application or not (70% success rate).  Only 4 out of 10 users were able to customize Outlook for Polaris, and none were able to Polarize multiple instances of a single application.

Despite being well aware of the risks of using the Internet, many users chose to open downloaded applications unsafely just because it was easier and quicker.  Some of them went on to then open the application again safely, as if that would help.  In post-task interviews, many users felt their data wasn’t worth protecting.  Perhaps educating users as to why their data is important would be more effective than training them to use security software.

The authors recommend that Polaris automatically create a new confined area every time an application starts and wipe it clean when the application exits, so that users don’t have to decide whether or not to create multiple instances.  (Having been involved with the project, I can say that one reason we didn’t do this was that the Polarization process involved changing some Windows security settings, and that took too long to do on every application launch.  I agree that it would have been nice to be able to, though.)

(By the way: I did not invent the Principle of Least Authority, as the first slide suggested.  :) I do advocate it, though.)

Judging from the answer to a question, the users weren’t personally at risk, since they were using the experimenters’ computers.  They were told to treat the login as though it were their own, but the lack of personal risk is probably still a significant factor in why they didn’t care.

 

Although this evaluation was specific to Polaris, its conclusions largely parallel the general results of other security software evaluations. In study after study we seem to reach the same basic conclusions, to the point that much of the outcome could be guessed beforehand.

We seem to be developing a keen ability to review software and point out the HCISEC flaws, but it also seems that more often than not we are leery of positing suggestions for improvement. What can we do to encourage more studies to explore alternatives and provide solid direction?

Also, how many people are needed for a usability study to generate useful results? When I get into discussions about this the conventional wisdom seems to suggest twenty, thirty, forty, or more people are necessary to achieve reliable results…and yet it seems many accepted papers (not just here, but at CHI and elsewhere) have far fewer (this study had ten). How much of a problem is this?

 

Ping: I totally agree about the artificiality of the experiment. Who cares if you release someone else’s bank account information? But can you imagine trying to get IRB approval for an experiment where people log into their own bank accounts?

Richard: For qualitative studies, five users is often enough to catch the biggest problems with the software. For statistical significance or if you have many categories of different users, you’d need a lot more.
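
To put a rough number on the "five users" rule of thumb, here is a minimal sketch of the problem-discovery model usually cited for it (Nielsen and Landauer): if each user independently runs into a given problem with probability p, then n users are expected to uncover a fraction 1 - (1 - p)^n of the problems. The default p = 0.31 is the average they reported across their projects; it is an assumption here, not a figure from the Polaris study, and the function name is made up for illustration.

```python
def share_of_problems_found(n_users: int, p_detect: float = 0.31) -> float:
    """Expected fraction of usability problems seen at least once.

    Assumes every user hits any given problem independently with the
    same probability p_detect (0.31 is an assumed average, not a value
    measured in the Polaris study).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be between 0 and 1")
    # Probability that all n users miss a problem is (1 - p)^n.
    return 1.0 - (1.0 - p_detect) ** n_users


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"{n:2d} users: {share_of_problems_found(n):.0%} of problems expected")
```

Under these assumptions the curve flattens quickly after the first handful of users, which is why small qualitative studies can still surface the biggest problems. It says nothing about statistical significance for comparisons between groups, which is where the larger samples come in.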