From an Unix Head Perspective
I would like to discuss the joys and sorrows of WordPress. WordPress is the tool that I use to manage my web site. It is a powerful tool and thus very useful.
BLOG
This is the final installment of the eight-part series on the Swift MySQL RESAR saga. In Part 7, we presented the performance results for Swift RESAR and the Stream Star Schema. In this post we complete the series by summarizing the project and attempting to draw some conclusions.
In this project, we have described and demonstrated a powerful extension to Swift cloud storage: Swift RESAR. This facility helps Swift administrators manage large numbers of cloud devices. It also lets those administrators apply mathematical models to optimize device reliability.
The Swift MySQL RESAR project had less-than-stellar results in its first iteration. Last time I presented a new approach to Swift RESAR: the Stream Star Schema. In this post we will present results for this new approach.
Once the Stream Star Schema was defined, the Swift RESAR database was also implemented as a Stream Star Schema. The following is the Stream Star Schema for RESAR:
MetaData Fact Table
Last time we theorized why the RESAR Swift MySQL approach has such dismal performance results. In this post we will discuss a new approach.
Specifically, we will define the stream star schema, which results in a database that is optimized for insertion performance. This performance improvement comes primarily from an improved attribute indexing mechanism. Many relational databases, MySQL among them, use B-trees to implement attribute indexes. Index insertion time thus depends on the table size and is O(log n), where n is the number of entries in the table. For the stream star schema, we propose using hash tables instead of B-trees, since inserting into a hash table takes constant time on average regardless of table size.
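To make the difference concrete, here is a minimal Python sketch. It is not MySQL's B-tree and not the RESAR implementation: `comparisons_to_place` is a hypothetical helper that counts the key comparisons a binary search needs to find the insertion point in an ordered index, and `HashIndex` is a hypothetical stand-in that wraps Python's built-in `dict` (itself a hash table).

```python
def comparisons_to_place(sorted_keys, key):
    """Count the comparisons a binary search makes to find where `key`
    belongs in an ordered index. This grows roughly as log2(n)."""
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return probes


class HashIndex:
    """Hypothetical attribute index backed by a hash table (a dict)."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, row_id):
        # Average cost does not depend on how many keys are already stored.
        self._rows[key] = row_id

    def lookup(self, key):
        return self._rows.get(key)


if __name__ == "__main__":
    # range objects support indexing, so no million-element list is built.
    for n in (1_000, 1_000_000):
        print(n, "entries:", comparisons_to_place(range(n), n), "comparisons")

    index = HashIndex()
    index.insert("disklet-7", 42)  # key and row id are made up
    print(index.lookup("disklet-7"))
```

The comparison count climbs slowly but steadily with the size of the ordered index, while the hash insert does the same amount of work on average at any size. That is the property the stream star schema is meant to exploit.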
My last blog post presented the performance results for RESAR Swift using MySQL. I will now analyze why this approach has such dismal performance results.
You will remember that for a cloud cluster of 1 million devices, database construction time was (on average) 0.12 seconds for a single device and reliability group. It required over 34 hours to create the entire database of 1 million devices and 1 million reliability groups. In addition, the minimum query time was 0.000272622203333 seconds and the maximum was 0.00045933403 seconds. We considered this database creation time excessive and hoped that the second RESAR Swift approach would greatly improve database performance.
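As a rough sanity check on these figures, the per-device average can be extrapolated to the full cluster. The sketch below assumes only the numbers quoted above; the helper name `total_build_hours` is made up for illustration.

```python
SECONDS_PER_HOUR = 3600


def total_build_hours(devices, seconds_per_device):
    """Extrapolate total construction time from the per-device average."""
    return devices * seconds_per_device / SECONDS_PER_HOUR


if __name__ == "__main__":
    hours = total_build_hours(1_000_000, 0.12)
    print(f"{hours:.1f} hours")
```

The rounded average of 0.12 seconds works out to roughly 33 hours, in line with the measured total of over 34 hours. The small gap is presumably due to rounding in the quoted average.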
This post is a continuation of the Swift RESAR saga. Last time I presented the design of RESAR Swift using MySQL. I would now like to present performance results for this implementation.
The following shows MySQL database construction times for a given number of disk devices. Each disk device was partitioned into 3 disklets. A Reliability Group was also created for each disk device. Each Reliability Group consisted of 3 disklets.
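The layout of this test data can be sketched in a few lines of Python. This is an illustration only, not the project's code: the class names are hypothetical, and the rule for assigning disklets to groups (slot k of group g comes from device (g + k) mod n) is an assumption made here so that the example is complete. The original placement rule is not described in this post.

```python
from dataclasses import dataclass

DISKLETS_PER_DEVICE = 3
DISKLETS_PER_GROUP = 3


@dataclass(frozen=True)
class Disklet:
    device_id: int
    slot: int  # position of the disklet within its device


@dataclass
class ReliabilityGroup:
    group_id: int
    members: list


def build_cluster(num_devices):
    """Create 3 disklets per device and one reliability group per device."""
    disklets = [
        Disklet(device, slot)
        for device in range(num_devices)
        for slot in range(DISKLETS_PER_DEVICE)
    ]
    groups = []
    for g in range(num_devices):
        # ASSUMPTION: hypothetical placement rule, chosen so that each
        # disklet lands in exactly one group.
        members = [
            Disklet((g + k) % num_devices, k)
            for k in range(DISKLETS_PER_GROUP)
        ]
        groups.append(ReliabilityGroup(g, members))
    return disklets, groups


if __name__ == "__main__":
    disklets, groups = build_cluster(10)
    print(len(disklets), "disklets in", len(groups), "reliability groups")
```

With n devices this yields 3n disklets and n reliability groups, which matches the one-group-per-device setup described above.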
This continues the RESAR saga. Last time I explained the project design; now I’d like to focus on how RESAR was implemented in Swift.
During our research, we realized that there were essentially two approaches available. The first approach emphasized leveraging existing code in the Python community. Swift is implemented in the Python programming language, so the Swift RESAR project is implemented in Python as well. For the first approach, the primary goal was to minimize the amount of new code and thereby get timely results. The second approach, on the other hand, emphasized performance: we were willing to write new code as long as it resulted in better RESAR performance. This approach will be presented in a future blog post.
Previously I presented some background on Swift. I’d like to continue by focusing on the RESAR project. This project leverages previous work done by Ignacio Corderi on the subject of cloud device management. He presented his research in a paper: “RESAR storage: a system for two-failure tolerant, self-adjusting million disk storage clusters”. This paper was co-authored by Dr. Darrell D. E. Long (a professor at the University of California, Santa Cruz), Dr. Thomas M. Kroeger (of Sandia National Laboratories), and Dr. Thomas Schwarz (a professor at Santa Clara University).
Previously I presented a new project that I am working on for my PhD: “Swift RESAR”. To better understand it, here’s some background on Swift. Swift is a cloud storage implementation: free, open-source software released under the terms of the Apache License. The project is managed by the OpenStack Foundation, a non-profit corporation established in September 2012. It is gaining wide popularity: over 150 companies currently participate in the project.
I would like to describe the latest project that I am working on for my PhD thesis program. This project involves a number of participants, so I will first provide short biographies.
First off, we have my thesis adviser (at Santa Clara University) Dr. Ahmed Amer. Dr. Amer received his Doctorate in Computer Science from the University of California, Santa Cruz. He is now an Associate Professor at both UC Santa Cruz and Santa Clara University. Ignacio Corderi is one of Dr. Amer’s PhD students at UC Santa Cruz. Recently Ignacio was the lead author of a rather interesting paper: “RESAR Storage: a System for Two-Failure Tolerant, Self-Adjusting Million Disk Storage Clusters.” Additional authors of this paper are: