Intersect exists to increase research productivity through advanced technologies such as supercomputing, software engineering, and artificial intelligence. Operated by 14 universities and large research organisations, it is a major enabler of the National Collaborative Research Infrastructure Strategy (NCRIS) and was competitively selected as a node of the Australian Research Cloud. Although Intersect had allocated over $12M in capital funding to deliver research computing and storage as-a-service, in 2014 it had yet to break ground on its compute cloud, or to offer a repeatable or sustainable storage cloud. The funding window was closing, with capital procurement contractually required to be completed by 31 December. If this deadline was not met, a substantial proportion of pre-committed NCRIS funding would expire and have to be returned.
There were four strategic challenges in Q3 2014:
Retention of NCRIS funding within three months,
Compute cloud creation within six months,
Data-centre renegotiation within nine months,
Storage cloud remediation/replacement within twelve months.
Intersect lacked the capacity, expertise and leadership to achieve even one of these critical milestones. Each was subject to rigid, predetermined time, budget and contractual constraints and, even had these been negotiable, the company’s financial position demanded that they be resolved even faster where possible.
What Made This Project Unique?
Because the NeCTAR Cloud was already established and running, we had to adapt to its past design and architectural decisions. Given the rapidly evolving pace of ICT hardware, we also had to future-proof any further decisions while still maintaining compatibility with the current state.
We had to understand the storage and computational needs of leading scientists, getting to know a breadth of different fields and types of research.
Next we had to translate those needs into the technical implications for the physical architecture we would need to build.
And finally we had to concisely summarise both ends of that spectrum and relate them to the business requirements of a myriad of stakeholders, which included almost every major research institution and university in Australia.
Challenge #1: Size
The particle-collision experiments at CERN’s Large Hadron Collider in Switzerland are a direct example of one of the research cases that had to be accommodated. Each experimental run generates staggering numbers of data files. Just storing that sheer volume of data presents its own unique problems: if even one file is corrupted, the integrity of the entire dataset is at risk. The speed at which that data is generated, and the need to preserve the relationships between data points, necessitate extremely fast connectivity and data storage.
The key solution was having multiple tiers of data storage.
First, Intersect needed storage fast enough to keep up with the processing demands of a supercomputer, which meant an all-flash solution. Data would migrate down the tiers depending on the accessibility needs of its users. The middle tiers were fast and slow disk, and the final destination was Spectra Logic tape. Once on tape, data was ‘archived’ and could then be shared freely with other researchers. This hierarchical system (four tiers in total) gave users the extremely fast solid-state storage they needed, while saving substantial costs by using disk and tape to hold the lower tiers of data.
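The demotion half of that hierarchy can be sketched in a few lines. The tier names mirror the four-tier layout described above; the idle-time thresholds are illustrative assumptions, not Intersect's actual policy:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Four-tier hierarchy as described: flash -> fast disk -> slow disk -> tape.
TIERS = ["flash", "fast_disk", "slow_disk", "tape"]

# Hypothetical thresholds: days since last access before data moves down a tier.
DEMOTION_DAYS = {"flash": 7, "fast_disk": 30, "slow_disk": 180}

@dataclass
class Dataset:
    name: str
    tier: str
    last_access: datetime

def demote_if_stale(ds: Dataset, now: datetime) -> Dataset:
    """Move a dataset down one tier once it has been idle past its threshold."""
    limit = DEMOTION_DAYS.get(ds.tier)  # tape has no threshold: final tier
    if limit is not None and (now - ds.last_access) > timedelta(days=limit):
        ds.tier = TIERS[TIERS.index(ds.tier) + 1]
    return ds
```

Running such a policy periodically is what keeps the expensive flash tier reserved for actively used data.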
Challenge #2: Sharing
On top of that came the ability to share that data freely with other scientists around the world, often involving datasets of highly confidential information. Some researchers wanted their data stored locally on our tapes, while others wanted copies sent to them internationally. This created a complex system of data constantly moving between tiers on demand.
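The on-demand movement has a promotion side as well as a demotion side: a dataset parked on tape must be staged back to disk before it can be transferred to a collaborator. A minimal sketch of that recall step, with hypothetical tier names and catalog structure:

```python
def stage_for_sharing(catalog: dict, name: str) -> str:
    """Recall a dataset out of the tape tier so it can be transferred.

    `catalog` maps dataset name -> current tier (an illustrative stand-in
    for a real storage-management catalogue).
    """
    tier = catalog.get(name)
    if tier is None:
        raise KeyError(f"unknown dataset: {name}")
    if tier == "tape":
        catalog[name] = "slow_disk"  # stage from archive back to disk first
    return catalog[name]
```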
Challenge #3: Physical
The initial construction of the data centre posed its own set of problems: power management, heat management, and physical access all had to be resolved. On top of this the running costs had to be minimised to satisfy a tight budget.
Everything from the foundation type, spacing, cooling, and hot/cold aisle distribution to the electricity demands had to be accurately designed and calculated before construction could begin.
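A taste of the sizing arithmetic that design work involves: given a rack count and per-rack power draw, the IT load, the heat the cooling plant must remove, and the total facility draw follow directly. The figures and the PUE overhead factor below are hypothetical, not the data centre's actual specification:

```python
def cooling_load(racks: int, kw_per_rack: float, pue: float = 1.5) -> dict:
    """Back-of-envelope data-centre sizing (illustrative numbers only)."""
    it_kw = racks * kw_per_rack
    return {
        "it_load_kw": it_kw,
        # Essentially all IT power becomes heat; 1 kW ~ 3412 BTU/hr.
        "heat_btu_hr": it_kw * 3412,
        # PUE folds cooling and other overheads into the facility draw.
        "facility_kw": it_kw * pue,
    }
```

Feeding in, say, 20 racks at 10 kW each shows why running costs dominate: a 1.5 PUE turns a 200 kW IT load into a 300 kW facility draw.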
Challenge #4: Security
The next pressing concern was the security required for most of the research data. A considerable portion of the data was classified – whether intellectual property, medical information, or otherwise particularly sensitive material.
Access to the Australian Academic and Research Network (AARNet) was authenticated via the Australian Access Federation (AAF).
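Conceptually, federated access means a service trusts identity attributes asserted by a user's home institution rather than keeping its own password database. A minimal sketch of that access decision, using common eduPerson attribute names; the trusted-IdP list and the policy itself are hypothetical illustrations, not AAF's actual rules:

```python
# Assumed set of federation-member identity providers (illustrative).
TRUSTED_IDPS = {"idp.example-uni.edu.au"}

def allow_access(assertion: dict) -> bool:
    """Admit a user only if the assertion comes from a trusted identity
    provider and carries an eligible affiliation attribute."""
    return (
        assertion.get("issuer") in TRUSTED_IDPS
        and assertion.get("eduPersonAffiliation") in {"staff", "student", "faculty"}
    )
```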
Challenge #5: Processing
As the project for Intersect developed, storage wasn't the only service they wanted to deliver via the cloud. Soon the possibility of supercomputing power, accessible to researchers across Australia, came to the table. It would become the most powerful supercomputer in the Southern Hemisphere.
We added 4000 cores to the computing resource of 34,000 cores across Australia.
This processing module had to be connected via fibre-optic cables to the newly built data storage solution.
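Much of what such a compute resource serves is batches of independent tasks fanned out across many cores and gathered back. A toy stand-in for that pattern, using threads in place of real MPI ranks spread across cluster nodes; the worker function is purely illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def analyse(sample: int) -> int:
    """Stand-in for one independent research task (hypothetical workload)."""
    return sample * sample

def run_batch(samples, workers: int = 4):
    # Fan tasks out across workers and collect results in submission order,
    # the same shape as distributing work across a cluster's cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyse, samples))
```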
Challenge #6: Project Methodology
The “problem” was changing almost every week, requiring ever-evolving solutions to ever-evolving questions. As each university and each team of researchers came on board, they would all need their own type of connection, their own storage needs, their own processing needs. This sounds like a familiar beast already tamed by Agile methodology; however, the nature of hardware implementations required a unique twist on the typically software-centric approach to project management.
"Through Fred and his team’s delivery professionalism and ability to grasp extremely complex computing technologies all four goals were achieved on budget, on time, and met all funding body acceptance tests. We created, launched and operated state-of-the-art Space peta-scale research cloud storage and Time tera-scale research cloud compute in 2015. Intersect successfully competed against multinationals Amazon, Google, Microsoft, IBM and VMware with research oriented first-to-market offerings like cloud MPI and GPU computing-as-a-service, and elastic big data-as-a-service. Monetisation of these products critically enabled growth and generated revenue, securing financial sustainability during an extremely volatile research funding period through 2020." – Marc Bailey, CIO