223 pages, published in 2017
The first time I got involved in a rigorous problem-solving effort related to system resilience and strategic engineering was during my internship at Intel in Santa Clara. As a Ph.D. student, my main concern at school was to build functional chips, chips that simply worked when I powered them up. But in the Skylake Server CPU group at Intel, for the first time, I faced questions about the reliability of the server chip, the back-end platform of cloud computing and big data management, in an uncertain, perhaps distant future. The following questions were the main considerations:
• How can we foresee the failures and avoid (or delay) them during the design stage?
• What type of failure is more likely to happen in a specific block of the system?
• What are the Mean Time to Failure (MTTF) and Mean Time Between Failures (MTBF)?
• How can we maintain and dynamically correct our system while it is running?
• How can we expand and scale the system with new software and hardware features without jeopardising reliability, sustainability and security?
That short exposure to strategic engineering had a long-lasting impact on my approach to engineering in general, and to integrated circuits and systems design in particular. Later that autumn, when I returned to my tiny damp cubicle in Building 38 at MIT to continue working on nano-relay-based digital circuits, my concern was no longer merely the functionality of my systems right out of the box. The durability, scalability, resilience and sustainability of the system started to play an important role in my design strategies and decisions.
In the new age of global interconnectivity, big data and cloud computing, this book provides a great introduction to the flourishing research field of strategic engineering for cloud computing and big data analytics. It covers a range of interesting topics in this multidisciplinary research area and tries to address critical questions about systems lifecycle, maintenance strategies for deteriorating systems, integrated design with multiple interacting subsystems, systems modelling and analysis for cloud computing, software reliability and maintenance, cloud security, and a strategic approach to cloud computing.
While many questions about the future of big data in the next 20 years remain unanswered today, good insight into computational system modelling, maintenance strategies, fault tolerance, dynamic evaluation and correction, and cloud security would definitely pave the way for a better understanding of the complexity of the field and an educated prediction of its future.
Dr. Hossein Fariborzi
Assistant Professor
King Abdullah University of Science and Technology
Saudi Arabia
Amin Hosseinian-Far, Muthu Ramachandran, DilshadSarwa