CAREER: Automated Storage Manager - An Operating System for Network I/O Devices

Elizabeth Varki
Department of Computer Science
University of New Hampshire
Kingsbury Hall
Durham, NH 03824

Phone: (603) 862-2319
Fax: (603) 862-3493
Email: varki@cs.unh.edu
www: www.cs.unh.edu/~varki

WWW Page
www.cs.unh.edu/~varki/me/career.html

Project Award Information

Keywords
storage data, storage management, storage area networks, network attached storage, Fibre Channel, performance evaluation, disk arrays, performance models, caching techniques, prefetching, cache management.

Project Summary
Storage technology has moved from a system-bus interconnect to a network interconnect with the development of Network-Attached Storage (NAS) and Storage Area Network (SAN) technologies, which allow storage devices to be connected directly to the network. However, storage management, an increasingly difficult job, remains under the control of administrators and operating systems. This work proposes the development of an independent storage management system that controls all access to the storage devices, much as an operating system controls all access to a computer system. The storage manager should be able to quickly evaluate the performance of the devices under its control for various system configurations and workloads. To develop this software, performance models of real storage devices were developed. These models incorporate the impact of the internal optimization algorithms implemented in the device controllers. They are computationally simple and can be used to estimate the performance impact of various storage configurations under different workloads. The most effective way to speed up storage devices is to ensure that required data are already loaded into the cache from the disks when read I/O requests for the data arrive, and that there is sufficient cache space for all write I/O requests to be written to cache immediately. Storage caching techniques designed to exploit the locality of reference of I/O workloads were developed to use storage caches efficiently. In addition, scheduling techniques designed for storage devices on a network were developed as part of the project.

Publications and Products

Work in Progress

Project Objectives
This project contributed to the educational experience of the graduate and undergraduate students who took operating systems and performance classes. Several graduate students worked on different aspects of the project, which contributed to their educational development.

The objective of this project was to develop a storage manager. We developed several parts of the storage manager, which are described below.

For a storage manager to decide where to place data, it has to evaluate performance under various configurations. To this end, performance models of storage systems under different workloads were developed. Prior to this work, most performance models of storage devices analyzed only the impact of physical disk specifications and ignored the impact of the internal algorithms implemented by storage controllers. We developed models that incorporate the impact of these internal controller algorithms.
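
To illustrate what a computationally simple performance model can look like, the following is a minimal sketch and not one of the project's models. It assumes that misses are served by a single disk modeled as an M/M/1 queue and that the effect of the controller's caching algorithm can be summarized by a fixed hit ratio. The function name, parameters, and numbers are hypothetical.

```python
def mean_response_time(arrival_rate, disk_service_time, cache_service_time, hit_ratio):
    """Estimate the mean I/O response time (seconds) of a disk behind a cache.

    Illustrative assumptions only: cache hits are served in cache_service_time
    with no queueing; cache misses go to one disk modeled as an M/M/1 queue.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be in [0, 1]")
    # Only cache misses reach the disk.
    miss_rate = arrival_rate * (1.0 - hit_ratio)
    utilization = miss_rate * disk_service_time
    if utilization >= 1.0:
        raise ValueError("disk is saturated; no steady state exists")
    # Mean response time of an M/M/1 queue: S / (1 - rho).
    disk_response = disk_service_time / (1.0 - utilization)
    # Weight the two paths by how often each is taken.
    return hit_ratio * cache_service_time + (1.0 - hit_ratio) * disk_response


if __name__ == "__main__":
    # Hypothetical workload: 100 requests/s, 5 ms disk, 0.1 ms cache, 50% hits.
    print(mean_response_time(100.0, 0.005, 0.0001, 0.5))
```

Under these assumed numbers the estimate is about 3.4 ms. A storage manager could evaluate such a closed-form expression for many candidate configurations at negligible cost, which is the property the paragraph above asks of a model.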

The ultimate goal of the storage manager was to speed up the storage system. Storage caching policies play a key role in speeding up the devices. As part of the project, we designed and implemented storage caching techniques for I/O workloads.
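
As an illustration of how a caching policy can exploit locality in an I/O workload, here is a minimal sketch of an LRU block cache with sequential read-ahead. It is a generic textbook combination, not the techniques developed in this project; the class name, the `prefetch_depth` parameter, and the rule for detecting a sequential run are assumptions made for the example.

```python
from collections import OrderedDict


class PrefetchingLRUCache:
    """LRU block cache that reads ahead when it sees two consecutive blocks."""

    def __init__(self, capacity, prefetch_depth=1):
        self.capacity = capacity
        self.prefetch_depth = prefetch_depth  # blocks fetched ahead of a run
        self.blocks = OrderedDict()           # least recently used block first
        self.last_block = None
        self.hits = 0
        self.misses = 0

    def _insert(self, block):
        """Place a block at the most-recently-used end, evicting if full."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the LRU block

    def read(self, block):
        """Serve a read request; return True on a cache hit."""
        hit = block in self.blocks
        if hit:
            self.hits += 1
        else:
            self.misses += 1
        self._insert(block)
        # Assumed heuristic: a request for the block right after the previous
        # one signals a sequential run, so fetch the next few blocks early.
        if self.last_block is not None and block == self.last_block + 1:
            for offset in range(1, self.prefetch_depth + 1):
                self._insert(block + offset)
        self.last_block = block
        return hit


if __name__ == "__main__":
    cache = PrefetchingLRUCache(capacity=4, prefetch_depth=2)
    for b in range(6):
        cache.read(b)
    print(cache.hits, cache.misses)
```

On a purely sequential scan of blocks 0 through 5, this sketch misses only on the first two blocks, while the same cache with `prefetch_depth=0` misses on all six. Read-ahead helps only when the workload actually contains sequential runs; on random access it evicts useful blocks.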

Storage scheduling policies for large storage systems must deal with replication. We developed scheduling policies for storage units that have several replicas in various locations.
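
To make the replica-scheduling problem concrete, the sketch below sends each request to the replica with the smallest estimated completion time. This is a simple greedy heuristic chosen for illustration, not the project's scheduling policies; the `Replica` fields, the cost estimate (network latency plus queued work), and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    network_latency: float  # seconds to reach this replica (assumed known)
    service_time: float     # mean seconds to serve one request
    queued: int = 0         # requests currently outstanding at this replica

    def estimated_completion(self):
        # The new request waits behind everything queued, then is served.
        return self.network_latency + (self.queued + 1) * self.service_time


def dispatch(replicas):
    """Pick the replica with the lowest estimated completion time."""
    best = min(replicas, key=Replica.estimated_completion)
    best.queued += 1
    return best


if __name__ == "__main__":
    replicas = [
        Replica("local", network_latency=0.001, service_time=0.010),
        Replica("remote", network_latency=0.015, service_time=0.010),
    ]
    print([dispatch(replicas).name for _ in range(3)])
```

With these assumed values the first two requests go to the nearby replica and the third goes to the remote one, because queueing at the local replica has by then outweighed its latency advantage. A real policy would also need to decrement `queued` as requests complete and to handle writes, which must reach every replica.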

Area Background
Storage systems represent a growing market. In recent years there has been an explosion of applications (including scientific "grand-challenge" programs, multimedia systems, and large transaction-based information systems) that have varying performance needs and use enormous amounts of data. These applications place high Quality of Service (QoS) requirements on storage devices, irrespective of the location of the data and its users and of the problems that could interfere with data access. It is very difficult to coordinate the storage, network, and computation resources required for these heterogeneous applications. A solution to this problem of storage management is to have the storage system manage its own data and devices. In addition to academic labs, companies like HP, EMC, Veritas, and IBM and research labs like the NASD Lab at CMU are investigating the design and development of software for network-attached storage.

Area References

Acknowledgement Of Support And Disclaimer

"This material is based upon work supported by the National Science Foundation under Grant No. 0093111. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."