Murat Kantarcioglu, Ph.D., assistant professor in the Department of Computer Science and director of the Data Security and Privacy Laboratory at the University of Texas at Dallas, talks to Tanuja Koppal, Ph.D., contributing editor at Lab Manager Magazine, about the new technologies he is creating for mining and sharing various types of data without compromising the security or privacy of the information. He offers much-needed advice on how lab managers should go about setting up some basic protocols and appropriate levels of security for protecting the data in their labs.
Q: What needs to be done to set up adequate data security in a lab? Where do you begin?
A: There are many tools out there to help secure digital data in a laboratory. The first line of defense is setting up firewalls to limit remote access to the machines and servers that hold the data. The second line of defense is fine-grained access control over the various types of data, to determine who has access to what. If people need access to all the data that is available, then you can put accountability in place by creating a secure log-in mechanism that keeps track of who accesses what; if there is misuse or abuse, you can use the system logs to sort out any issues. This is the third line of defense. The fourth is the physical security of the systems, to avoid theft or loss of hardware. This involves having card access to the server locations and to the labs. What I am seeing is that old access continues even after people have left a lab. Physical security certainly should not be overlooked.
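[Editor's illustration: The second and third lines of defense described above, fine-grained access control and an access log, can be sketched in a few lines of Python. This is a minimal sketch, not a tool from Kantarcioglu's lab or any commercial product. The class name `AccessController`, its methods, and the user and dataset names are all hypothetical, and a real deployment would rely on the operating system's or database's own permission and logging facilities.]

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessController:
    """Hypothetical per-dataset permission table with an audit trail."""

    # Maps a user name to the set of dataset names that user may read.
    permissions: dict[str, set[str]] = field(default_factory=dict)
    # Record of every access attempt, whether allowed or denied.
    audit_log: list[dict] = field(default_factory=list)

    def grant(self, user: str, dataset: str) -> None:
        """Give one user access to one dataset (fine-grained control)."""
        self.permissions.setdefault(user, set()).add(dataset)

    def revoke_all(self, user: str) -> None:
        """Remove every permission, e.g. when someone leaves the lab."""
        self.permissions.pop(user, None)

    def request_access(self, user: str, dataset: str) -> bool:
        """Check a request and log it so misuse can be traced later."""
        allowed = dataset in self.permissions.get(user, set())
        self.audit_log.append(
            {
                "time": datetime.now(timezone.utc).isoformat(),
                "user": user,
                "dataset": dataset,
                "allowed": allowed,
            }
        )
        return allowed


if __name__ == "__main__":
    controller = AccessController()
    controller.grant("alice", "assay_results")
    print(controller.request_access("alice", "assay_results"))
    print(controller.request_access("bob", "assay_results"))
    # Revoking on departure addresses the "old access continues" problem.
    controller.revoke_all("alice")
    print(controller.request_access("alice", "assay_results"))
    print(len(controller.audit_log), "attempts logged")
```

[Denied attempts are logged alongside allowed ones, since the log is meant to help sort out misuse after the fact.]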
Q: How do you deal with securing really critical data?
A: For data that is very critical, we have to do some basic risk management. For instance, perform some ‘thought experiments’ to figure out what will happen if the data is leaked to the outside world. If the result is severe, then you should be more cautious and look into putting more controls in place, such as denying access to the Internet, disabling all USB ports, and minimizing unnecessary programs and operating system functionalities. For some kinds of data, I think the best solution is to limit its exposure. The more software programs you have on your machine, the more vulnerable it is to bugs [malicious code]. These are the basic steps that you need to follow to protect and secure your data. To implement these measures, your machines should be updated with the latest security patches. You realize the need for security once you have lost it. It’s like air. You realize its importance once you start suffocating.
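[Editor's illustration: The "thought experiment" above amounts to a simple rule: estimate the impact of a leak, then add stricter controls when that impact is severe. The sketch below writes the rule down in Python. The two impact labels and the function name `recommended_controls` are assumptions made for illustration. The controls themselves are the ones named in the interview, and this is not a formal risk-assessment method.]

```python
# Controls mentioned in the interview as the general lines of defense.
BASELINE_CONTROLS = [
    "firewall limiting remote access",
    "fine-grained access control",
    "logging of who accesses what",
    "latest security patches installed",
]

# Extra controls the interview suggests when a leak would be severe.
STRICT_CONTROLS = [
    "no Internet access",
    "all USB ports disabled",
    "unnecessary programs and OS functionality removed",
]


def recommended_controls(leak_impact: str) -> list[str]:
    """Return the controls to apply for a given estimated leak impact.

    leak_impact is a hypothetical label, either "limited" or "severe",
    chosen by the lab after thinking through what a leak would cause.
    """
    if leak_impact not in ("limited", "severe"):
        raise ValueError(f"unknown impact level: {leak_impact!r}")
    controls = list(BASELINE_CONTROLS)
    if leak_impact == "severe":
        controls.extend(STRICT_CONTROLS)
    return controls


if __name__ == "__main__":
    for level in ("limited", "severe"):
        print(level)
        for control in recommended_controls(level):
            print("  -", control)
```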
Q: Are there any resources that can help lab managers budget for some of these security initiatives?
A: Usually, the people in the information technology (IT) and security departments in most companies and organizations are well informed; asking them for help could be a good starting point. Also, the tools for setting up firewalls and database encryption may already be available in the IT department, and you may have the license to use the tools you need. It’s important to always start as early as possible with data security in mind, as it gets harder to put things in place later. At the same time, it’s never too late to start thinking about security. There are always things you can do to reduce the risks.
Q: How do you rate the commercially available tools, and can they be customized for your needs?
A: The main problem with commercially available tools is that there are many limitations with security when it comes to sharing data. The other issue is that some of the systems out there are too big and have many software bugs. Hence, as a last line of defense, we store our data in an encrypted data format and are developing tools to protect ourselves from some of those vulnerabilities. There is no absolute protection. It’s about reducing the risk to an acceptable level to manage a good research environment. Many start-ups and research groups are also putting out new tools. Over time there will be many more tools available to users and many of them will be open source.
Q: Is there any sharing of ideas across various fields when it comes to data security?
A: The tools that we are working on in my lab are supported by the work of, and grants from, the U.S. Air Force. The defense industry leads the way in this field, and we are trying to apply some of those same tools to the health care domain. However, there are some key differences. For instance, securely sharing data with coalition partners on different missions is critical in defense, but individual privacy is sometimes not an issue there. In health care, by contrast, individual privacy is a big issue. So there are differences, but there is a possibility for inspiration.
Q: What is your goal as the director of the Data Security and Privacy Lab?
A: Our mission is to store, share, mine, and learn from any kind of data without worrying about security and privacy issues. We are looking at this from an interdisciplinary perspective by integrating ideas from computer science with those from data mining, database and risk management, and also from economics. Our research is supported by grants from diverse sources, such as the National Science Foundation, Air Force Office of Scientific Research, Office of Naval Research, National Security Agency, and National Institutes of Health, and we are looking to come up with software tools to disseminate our research to the public. There are many labs across the country working on computer security, but what we are focusing on is the security of the data itself.
Murat Kantarcioglu, Ph.D., is an assistant professor in the Department of Computer Science and the director of the Data Security and Privacy Laboratory at the University of Texas at Dallas. His research focuses on creating technologies that can efficiently extract useful information from any data without sacrificing privacy or security. Recently, he has been working on security and privacy issues raised by data mining, privacy issues in social networks, privacy issues in health care, and risk and incentive issues in assured information sharing. His lab has created open source tools, such as the “Anonymization Tool Box,” that people can download and use to sanitize data and share it in a secure fashion. The tool box has compiled the best tools made available by various sources and put them together in one convenient location for public use. Kantarcioglu obtained his master’s degree and Ph.D. in computer science from Purdue University.