All posts by Scott Alan Miller

Titanic Project Management & Comparison with Software Projects

Few projects have ever achieved the fame and notoriety of the Titanic and her sister ships, the Olympic and the Britannic, which began design one hundred and ten years ago this year.  There are, of course, many lessons that we can learn from the fate of the Olympic-class ships in regard to project management and, in fact, many aspects of their project management are worth covering.

(When referring to the ships as a whole, I will simply reference them as the Olympics, as the three together were White Star Line’s Olympic-class ships.  Titanic’s individual and later fame is irrelevant here.  Also, I am taking the position that the general information pertaining to the Olympic ships, their history and their fate is common knowledge to the reader and will not be covered again.)

Given the frequency with which the project management of the Olympics has been covered, I think that it is more prudent to look at a few modern parallels where we can view current project management in today’s world through a valuable historic lens.  Project management is a discipline that has endured for millennia; many of its challenges, skills and techniques have not changed much, and the pitfalls of the past still very much apply to us today.  The old adage applies: if we don’t learn from the past, we are doomed to repeat it.

My goal here, then, is to examine the risk analysis, perception and profile of the project and apply that to modern project management.

First, we must identify the stakeholders in the Olympics project: White Star Line itself (sponsoring company and primary investor) and its director Joseph Bruce Ismay; Harland & Wolff (the contracted shipbuilder) with its principal designers Alexander Carlisle and Thomas Andrews; the ships’ crews, which include Captain Edward John Smith; the British government, as we will see later; and, most importantly, the passengers.

As with any group of stakeholders, there are different roles that are played.  White Star, on one side, was the sponsor and investor and in a modern software project would be analogous to a sponsoring customer, manager or department.  Harland & Wolff were the designers and builders and are most closely related to the “team members” of a modern software engineering team, the developers themselves.  The crews of the ships were responsible for operations after the project was completed and would be comparable to an IT operations team taking over the running of the final software after completion.  The passengers were much like end users today, hoping to benefit from both the engineering deliverable (ship or software) and the service built on top of that product (ferry service or IT managed services). (“Olympic”)

Another axis of analysis of the project is that of “chicken” and “pig” stakeholders, where chickens are involved and carry some risk while pigs are fully committed and carry ultimate risk.  In normal software projects we use these comparisons to talk about degrees of stakeholder commitment – those who are involved versus those who are committed – but in the case of the Olympic ships these terms take on new and horrific meaning, as the crew and passengers literally put their lives on the line in the operational phase of the ships, whereas the investors and builders were only financially at risk. (Schwaber)

Second, I believe that it is useful to distinguish between the different projects that exist within the context of the Olympics.  There was, of course, the design and construction of the three ships physically.  This was a single project with two clear components – one of design and one of construction – and three discrete deliverables, namely the three Olympic-class vessels.  There is, at the end of the construction phase, an extremely clear delineation point where the project managers and teams involved in the assembly of the ship would stop work and the crew that operated the ship would take over.

Here we can already draw an important analogue to the modern world of technology where software products are designed and developed by software engineers and, when they are complete, are handed over to the IT operational staff who take over the actual intended use of the final product.  These two teams may be internal under a single organizational umbrella or from two, or more, very separate organizations.  But the separation between the engineering and the operational departments has remained just as clear and distinct in most businesses today as it was for ship building and ferry service over a century ago.

We can go a step further and compare White Star’s transatlantic ferry service to many modern software-as-a-service vendors such as Microsoft Office 365, Salesforce or G Suite.  In these cases the company in question has an engineering or product development team that creates the core product and then a second team that takes that in-house product and operates it as a service.  It is an increasingly important business model in the software development space that the same company that creates the software is also its ultimate operator, but for external clients.  In many ways the relevance of the Olympics to modern software and IT is increasing rather than decreasing.

This brings up an important interface understanding that was missed on the Olympics and is often missed today: each side of the hand-off believed that the other side was ultimately responsible for safety.  The engineers touted the safety of their design, but when pushed were willing to compromise, assuming that operational procedures would mitigate the risks and that their own efforts were largely redundant.  Likewise, when pushed to keep things moving and make good time, the operations team was willing to compromise on procedures because they believed that the engineering team had gone so far as to make their efforts essentially unnecessary, the ship being so safe that operational precautions just were not warranted.  This miscommunication took the endeavor from having two overlapping systems of extreme safety down to basically none.  Had either side understood how the other would or did operate, they could have taken that into account.  In the end, both sides assumed, at least to some degree, that safety was the “other team’s job”.  While the ship was advertised heavily on the basis of safety, the reality was that it continued the general trend of the previous half century and more, in which each year ships were made and operated less safely than the year before. (Brander 1995)

Today we see this same problem arising between IT and software engineering – less around stability (although that certainly remains true) and now more around security, which can be viewed much as safety was in the Olympics’ context.  Security has become one of the most important topics of the last decade on both sides of the technology fence, and the industry faces the challenges created by the need for both sides to implement security practices thoroughly – neither is capable of truly implementing secure systems alone.  Planning for safety or security is simply not a substitute for enforcing it procedurally during operations.

An excellent comparison today is British Airways and how they approach every flight that they oversee as it crosses the Atlantic.  As the primary carrier of air traffic over the North Atlantic, the same path that the Olympics were intended to traverse, British Airways has to maintain a reputation for excellence in safety.  Even in 2017, flying over the North Atlantic is a precarious and complicated journey.

Before any British Airways flight takes off, the pilots and crew must review a three-hundred-page mission manual that tells them everything that is going on, including details on the plane, crew, weather and so forth.  This process is so intense that British Airways refuses to even acknowledge that it is a flight, officially referring to every single trip over the Atlantic as a “mission”, specifically to drive home to everyone involved the severity and risk of such an endeavor.  They clearly understand the importance of changing how people think about a trip such as this and are aware of what can happen should people begin to assume that everyone else will have done their job well and that they can cut corners on their own job.  They want no one to become careless or to begin to feel that the flight, even though completed several times each day, is ever routine. (Winchester)

Had the British Airways approach been used with the Titanic, it is very likely that disaster would not have struck when it did.  The operational side alone could have prevented the disaster.  Likewise, had the ship engineers been held to the same standards as Boeing or Airbus today, they likely would not have been so easily pressured by management to modify the safety requirements as they worked on the project.

What really affected the Olympics, in many ways, was a form of unchecked scope creep.  The project began as a traditional waterfall approach with “big design up front”, and the initial requirements were good, with safety playing a critical role.  Had the original project requirements and even much of the original design been used, the ships would have been far safer than they were.  But new requirements for larger dining rooms or more luxurious appointments took precedence, and the scope and parameters of the project were changed to accommodate them.  As with any project, no change happens in a vacuum; each has ramifications for other factors such as cost, safety or delivery date. (Sadur)

The scope creep on the Titanic specifically was dramatic, but hidden and not necessarily obvious for the most part.  It is easy to point out small changes such as a shift in dining room size, but of much greater importance was the change in the time frame in which the ship had to be delivered.  What really altered the scope was that the initial deadlines had to be maintained relatively strictly.  This was specifically problematic because, in the midst of Titanic’s dry dock work and later moored work, the older sibling, Olympic, was brought in for extensive repairs multiple times, which greatly reduced the amount of time in the original schedule available for Titanic’s own work to be completed.  This type of scope modification is very easy to overlook or ignore, especially in hindsight, as the physical deliverables and the original dates did not change in any dramatic way.  For all intents and purposes, however, Titanic was rushed through production much faster than had been originally planned.

In modern software engineering it is well accepted that no one can estimate the amount of time that a design task will take as well as the engineer(s) who will be doing the task themselves.  It is also generally accepted that there is no means of significantly speeding up engineering and design efforts through management pressure.  Once a project is running at maximum speed, it is not going to go faster.  Attempts to go faster will often lead to mistakes, oversights or omissions.  We know this to be true in software and can assume that it must have been true for ship design as well, as the principles are the same.  Had the Titanic been given the appropriate amount of time for this process, it is possible that safety measures would have been more thoroughly considered or at least properly communicated to the operational team at hand-off.  Teams that are rushed are forced to compromise, and since time cannot be adjusted when it is the constraint, the corners have to be cut somewhere else; almost always that comes from quality and thoroughness.  This might manifest itself as a mistake or perhaps as failing to fully review all of the factors involved when changing one portion of a design.

This brings us to holistic design thinking.  At the beginning of the project the Olympics were designed with safety in mind: safety that results from the careful interworking of many separate systems that together are intended to make for a highly reliable ship.  We cannot look at the components of a ship of this magnitude individually; they make no sense in isolation – the design of the hull, the style of the decks, the weight of the cargo, the materials used and the style of the bulkheads are all interrelated and must function together.

When the project was pushed to complete more quickly or to change parameters, this holistic thinking and a clear revisiting of earlier decisions was not done, or not done adequately.  Rather, individual components were altered without regard to how that would impact their role within the whole of the ship and the resulting impact to overall safety.  What may have seemed like a minor change had unintended consequences that went unforeseen because holistic project management was abandoned. (Kozak-Holland)

This change to the engineering was mirrored, of course, in operations.  Each change, such as not using binoculars or not taking water temperature readings, was individually somewhat minor, but taken together they were incredibly impactful.  Likely, though we cannot be sure, a cohesive project management or, at least, process improvement system was not being used.  Who was overseeing that binoculars were used, that the water tests were accurate and so forth?  Any check at all would have revealed that the tools needed for those tasks did not exist at all.  There is no way that so much as a simple test run of the procedures could have been performed, let alone regular checking and process improvement.  The need for process improvement is especially highlighted by the fact that Captain Smith had had practice on the RMS Olympic, caused an at-sea collision on her fifth voyage and then nearly repeated the same mistake with the initial launch of the Titanic.  What should have been an important lesson learned by all captains and pilots of the Olympic ships was instead ignored and repeated, almost immediately. (“Olympic”)

Of course ship building and software are very different things, but many lessons can be shared.  One of the most important is to see the limitations faced by ship building and to recognize when we are not forced to retain those same limitations when working with software.  The Olympic and Titanic were built nearly at the same time, with absolutely no time for engineering knowledge gleaned from the Olympic’s construction, let alone her operation, to be applied to the Titanic’s construction.  In modern software we would never expect such a constraint and would be able to test software, at least to some small degree, before moving on to additional software that is based upon it, either in real code or even conceptually.  Project management today needs to leverage the differences that exist, both in more modern times and in our different industry, to its best advantage.  Some software projects still do require processes like this, but these have become more and more rare over time and today are dramatically less common than they were just twenty years ago.

It is well worth evaluating the work that was done by Harland & Wolff with the Olympics, as they strove very evidently to incorporate what feedback loops were possible within their purview at the time.  Not only did they attempt to use the construction of earlier ships to learn more for the later ones – although this was very limited, as the ships were mostly under construction concurrently and most lessons would not have had time to be applied – but, far more importantly, they took the extraordinary step of having a “guarantee group” sail with the ships.  This guarantee group consisted of apprentice and master shipbuilders from all manner of supporting trades. (“Guarantee Group”)

The use of the guarantee group for direct feedback was, and truly remains, unprecedented, and it was an enormous investment in hard cost and time for the shipbuilders to sacrifice so many valuable workers to sail in luxury back and forth across the Atlantic.  The group was able to inspect their work first hand, see it in action, gain an understanding of its use within the context of the working ship, work together on team building, knowledge transfer and more.  Far more valuable than the feedback from the shipyards where the ships overlapped in construction, this was a strong investment in the future of their shipbuilding enterprise: a commitment to industrial education that would likely have benefited them for decades.

Modern deployment styles, tools and education have led from the vast majority of software being created under a waterfall methodology not so distinct from that used in turn-of-the-[last]-century shipbuilding, to most software leveraging some degree of Agile methodology allowing for rapid testing, evaluation, change and deployment.  Scope creep has changed from something that has to be mitigated or heavily managed to something that can be treated as expected and assumed within the development process, even to the point of almost being leveraged.  One of the fundamental problems with big design up front is that it always requires the customer or customer-role stakeholder to make “big decisions up front”, which are often far harder for them to make than the design is for the engineers.  These early decisions are often a primary contributor to scope creep or to later change requests and can often be reduced or avoided by agile processes that expect continuous change to requirements and build that into the process.

The shipbuilders, Harland & Wolff, did build a fifteen-foot model of the Olympic for testing, which was useful to some degree, but it of course failed to mimic the hydrodynamic action that the full-size ship would later produce and failed to predict some of the more dangerous side effects of the new vessel’s size when close to other ships, which led to the first accident of the group and to what was nearly a second.  The builders do appear to have made every effort to test and learn at every stage available to them throughout the design and construction process. (Kozak-Holland)

In modern project management this would be comparable to producing a rapid mock-up or wireframe for developers or even customers to get hands-on experience with before investing further effort into what might be a dead-end path for unforeseen reasons.  This is especially important in user interface design, where there is often little ability to properly predict usability or satisfaction without providing a chance for actual users to physically manipulate the system and judge for themselves whether it provides the experience they are looking for. (Esposito)

We must, of course, consider the risk that the Olympics undertook within their historical context of financial trends and forces.  At the time, and starting from the middle of the previous century, the prevailing financial thinking was that it was best to lean towards the risky rather than towards the safe – in terms of loss of life, cargo or ships – and to cover the difference via insurance vehicles.  It was simply more financially advantageous for the ships to operate in a risky manner than to be overly cautious about human life.  This trend, by the time of the Olympics, had been well established for nearly sixty years and would not begin to change until the heavy publicity of the Titanic sinking.  The market impact on the public did not exist until the “unsinkable” ship, with so many souls aboard, was lost in such a spectacular way.

This approach to risk and its financial trade-offs is one that project managers must understand today just as they did over one hundred years ago.  It is easy to be caught believing that risk is so important that it is worth any cost to eliminate, but projects cannot think this way.  It is possible to expend unlimited resources in the pursuit of risk reduction.  In the real world it is necessary that we balance risks against the cost of risk mitigation.  A great example of this in modern times, though outside of software development specifically, is the handling of credit card fraud in the United States.  Until just the past few years, it was generally the opinion of the US credit card industry that the cost of greater security measures on credit cards to prevent theft was too high compared to the risk of not having them; essentially it was more cost effective to reimburse fraudulent transactions than to prevent them.  This cost-to-risk ratio can sometimes be counterintuitive and even frustrating, but it is one that has to drive project decisions in a logical, calculated fashion.

In a similar vein, it is common in IT to design systems believing that downtime is an essentially unlimited cost and to spend vastly more attempting to mitigate a downtime risk than the actual outage event itself would likely cost if it were to occur.  This is obviously foolish, but so rarely are cost analyses of this type run, or run correctly, that it becomes far too easy to fall prey to this mentality.  In software engineering projects we must approach risks in a similar fashion.  Accepting that there is risk, of any sort, then determining the likelihood of that risk and the magnitude of its impact, and comparing those against the cost of mitigation strategies, is critical to making an appropriate project management decision in regards to the risk. (Brander 1995)
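To make that comparison concrete, here is a minimal sketch of the arithmetic involved, using entirely hypothetical names and figures (none of these numbers come from the sources above): the expected loss from a risk is its likelihood multiplied by its impact, and a mitigation is only financially justified when it costs less than the exposure it removes.

# A minimal sketch of risk exposure vs. mitigation cost (hypothetical figures only).

def expected_annual_loss(probability_per_year, impact_cost):
    # Expected loss = likelihood of the event occurring in a year x cost if it occurs.
    return probability_per_year * impact_cost

outage_probability = 0.05   # assumed: 5% chance of a major outage in a given year
outage_cost = 200_000       # assumed: a major outage would cost roughly $200,000
mitigation_cost = 50_000    # assumed: the proposed high-availability design costs $50,000 per year

exposure = expected_annual_loss(outage_probability, outage_cost)   # $10,000 per year

if mitigation_cost < exposure:
    print("Mitigation is justified on cost alone.")
else:
    print("Mitigation costs more than the risk it removes; accept, reduce or transfer the risk instead.")

In this illustrative case the mitigation costs five times the exposure it removes, which is exactly the kind of imbalance the paragraph above warns against.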

Also of particular interest to extremely large projects, for which the Olympics certainly qualified, is the additional concept of being “too big to fail.”  This is, of course, a modern phrase that came about during the financial crisis of the past decade, but the concept and the reality of it are far older and a valuable consideration for any project that falls onto a scale that would register as a national financial disaster should the project totally falter.  In the case of the Olympics the British government ultimately insulated the investors from total disaster, as the collapse of one of the largest passenger lines would have been devastating to the country at the time.

White Star Line was simply “too big to fail” and was kept afloat, so to speak, by the government before being forcibly merged into Cunard some years later.  Whether this – knowing that the government would not want to accept the risks of the company failing – was calculated or considered at the time, we do not know.  We do know, however, that it is taken into consideration today with very large projects.  A current example is Lockheed Martin’s F-35 fighter, which is dramatically over budget, past its delivery date and no longer even considered likely to be useful, yet has been buoyed for years by different government sponsors who see the project as too important, even in a state of failure to deliver, for the national economy to allow it to fully collapse.  As this phenomenon becomes better and better known, it is likely that we will see more projects take it into consideration in their risk analysis phases. (Ellis)

Jumping to the operational side of the equation, we could examine any number of aspects that went wrong leading to the sinking of the Titanic, but at the core I believe that what was most evident was a lack of standard operating procedures throughout the process.  This is understandable to some degree, as the ship was on its maiden voyage and there was little time for process documentation and improvement.  However, this was the flagship of a long-standing shipping line that had a reputation to uphold and a great deal of experience in these matters.  That excuse would also overlook the fact that, by the time Titanic attempted her first voyage, the Olympic had already been in service more than long enough to have developed a satisfactory set of standard operating procedures.

Baseline documentation would have been expected even on a maiden voyage; it is unreasonable to expect a ship of such scale to function at all unless there is coordination and communication among the crew.  There was plenty of time, years in fact, for basic crew operational procedures to be created and prepared before the first ship set sail and, of course, this would have to be done for all ships of this nature, but it was evident that such operating procedures were missing or untested in the case of the Titanic.

The party responsible for operating procedures would likely be identified as being from the operations side of the project equation, but some degree of such documentation would need to be provided by, or coordinated with, the engineering and construction teams as well.  Many of the procedures that broke down on the Titanic included chain-of-command failures under pressure, with the director of the company taking over the bridge and the captain allowing it; wireless operators being instructed to relay passenger messages as a priority over iceberg warnings; wireless operators being allowed to tell other ships attempting to warn them to stop broadcasting; critical messages not being brought to the bridge; tools needed for critical jobs not being supplied; and so forth. (Kuntz)

Much as was needed with the engineering and design of the ships, the operation of the ships needed strong and holistic guidance ensuring that the ship and its crew worked as a whole, rather than treating departments, such as the Marconi wireless operators, as individual units.  In that example, the operators were not officially crew of the ship but employees of Marconi, on board to handle paid passenger communiqués and to handle ship emergency traffic only if time allowed.  Had they been overseen as part of a holistic operational management system, even as outside contractors, it is likely that their procedures would have been far more safety focused or, at the very least, that service level agreements around getting messages to the bridge would have been clearly defined rather than ad hoc and discretionary.

In any project and project component, good documentation, whether of project goals, deliverables, procedures and so forth, is critical, and project management has little hope of success if good communication and documentation are not at the heart of everything that we do, both internally within the project and externally with stakeholders.

What we find is that the project management lessons of the Olympic, Titanic and Britannic remain valuable to us today, and that the lessons of the era – whether pushing for iterative project design where possible, investing in tribal knowledge, calculating risk, understanding the roles of system engineering and system operations, or recognizing the interactions of protective external forces on product costs – are still relevant.  The factors that affect projects come and go in cycles; today we see trends leaning towards models more like the Olympics than unlike them.  In the future, likely, the pendulum will swing back again.  The underlying lessons are very relevant and will continue to be so.  We can learn much by evaluating both how our own projects are similar to those of White Star and how they are different from them.

Bibliography and Sources Cited:

Miller, Scott Alan.  Project Management of the RMS Titanic and the Olympic Ships, 2008.

Schwaber, Ken. Agile Project Management with Scrum. Redmond: Microsoft Press, 2003.

Kuntz, Tom. The Titanic Disaster Hearings: The Official Transcripts of the 1912 Senate Investigation. New York: Pocket Books, 1998. Audio edition via Audible.

Kozak-Holland, Mark. Lessons from History: Titanic Lessons for IT Projects. Toronto: Multi-Media Publications, 2005.

Brown, David G. “Titanic.” Professional Mariner: The Journal of the Maritime Industry, February 2007.

Esposito, Dino. “Cutting Edge – Don’t Gamble with UX—Use Wireframes.” MSDN Magazine, January 2016.

Sadur, James E. Home page. “Jim’s Titanic Website: Titanic History Timeline.” (2005): 13 February 2017.

Winchester, Simon. Atlantic. Harper Perennial, 2011.

Titanic-Titanic. “Olympic.” (Date Unknown): 15 February 2017.

Titanic-Titanic. “Guarantee Group.” (Date Unknown): 15 February 2017.

Brander, Roy. P. Eng. “The RMS Titanic and its Times: When Accountants Ruled the Waves – 69th Shock & Vibration Symposium, Elias Kline Memorial Lecture”. (1998): 16 February 2017.

Brander, Roy. P. Eng. “The Titanic Disaster: An Enduring Example of Money Management vs. Risk Management.” (1995): 16 February 2017.

Ellis, Sam. “This jet fighter is a disaster, but Congress keeps buying it.” Vox, 30 January 2017.

Additional Notes:

Mark Kozak-Holland originally published his book in 2003 as a series of Gantthead articles on the Titanic:

Kozak-Holland, Mark. “IT Project Lessons from Titanic.” Gantthead.com the Online Community for IT Project Managers and later ProjectManagement.com (2003): 8 February 2017.

More Reading:

Kozak-Holland, Mark. Avoiding Project Disaster: Titanic Lessons for IT Executives. Toronto: Multi-Media Publications, 2006.

Kozak-Holland, Mark. On-line, On-time, On-budget: Titanic Lessons for the e-Business Executive. IBM Press, 2002.

US Senate and British Official Hearing and Inquiry Transcripts from 1912 at the Titanic Inquiry Project.

Standard Areas of Discipline Within IT

Information Technology and Business Infrastructure form an enormous field filled with numerous and extremely varied career opportunities, not just in the industries in which the work is done but also in the type of work that is done. Only rarely are any two IT jobs truly alike. The variety is incredible. However, certain standard career foci do exist and should be understood and known to everyone in the field, as they provide important terminology for mutual understanding.

It is very important to note that, as in any field, it is most common for a single person to fill more than one role throughout their career, and even at the same time. Just as someone may be a half-time burger cook and half-time cashier, someone may have their time split between different IT roles. But we need to know what those roles are and what they mean in order to convey value, experience and expectation to others.

These are what we refer to as “IT specializations”: areas of specific focus and opportunity for deep skill within IT. They often do not just represent job roles within IT; in large businesses they are generally representative of entire departments of career peers who work together. None of these areas of focus is more or less senior than any other; these are different areas, not levels. There is no natural or organic progression from one IT discipline to another; however, all IT experience is valuable, and it would be expected that experience in one discipline would prepare someone to more quickly learn and adapt to another.

The terms “Administration” and “Engineering” are also often applied today; these, again, are not levels, nor are they discipline areas. They refer to a role being focused on operations (the running of production systems) or on designing systems for deployment, respectively. These two share discipline areas; so, for example, the Systems discipline would have need for both administration and engineering workloads within it.

Systems. Shortened from “operating systems.” Systems roles are focused on the operating systems, normally of servers (but not necessarily in all cases). This is the most broadly needed specialized IT role. Within systems, specializations tend to follow the operating system, such as Windows, RHEL, SUSE, Ubuntu, AIX, HP-UX, Solaris, FreeBSD, Mac OS X and so forth. High-level specializations such as UNIX are common, with a single person or department servicing any system that falls under that umbrella, or larger organizations might split AIX, Solaris, RHEL and FreeBSD into four discrete teams to allow for a tight focus on skills, tools and knowledge. Systems specialists provide the application platform on which computer programs (which would also include databases) will run. Desktop support is generally seen as a sub-discipline of systems, and one that often intersects pragmatically with end user and helpdesk roles.

Platforms. Also known as virtualization or cloud teams (depending on the exact role), the platform discipline focuses on the abstraction and management (hypervisor) layer that sits, or can sit, between physical hardware and the operating system(s). This team tends to focus on capacity planning, resource management and reliability. Foci within the platform specialization would commonly include VMware ESXi, vCloud, Xen, XenServer, KVM, OpenStack, CloudStack, Eucalyptus, Hyper-V and so forth. With the advent of massively hosted platforms, there has also arisen a need for foci on specific hosted implementations of platforms such as Amazon AWS, Microsoft Azure, Rackspace, SoftLayer and so on.

Storage. Storage of data is so critical to IT that it has split out as its own, highly focused discipline. Storage specialists generally focus on SAN, NAS and object storage systems. Focus areas might include block storage in general, or might drill down to a specific product or product line, such as EMC VMAX or HPE 3PAR. With recent growth in scale-out storage technologies, the storage arena is growing both in total size as well as in depth of skill expectation.

Databases. Similar to storage, databases provide critical “backing” of information to be consumed by other departments. While conceptually databases and storage overlap, in practice the two are treated very differently. We think of storage as “dumb”, “unstructured” or “bulk” storage and databases as “smart”, “focused” or “highly structured” storage. At a fundamental level the two are actually quite hard to distinguish; in practice, they are extremely different. Database specialists work specifically on database services, but rarely create databases and certainly do not code database-connected applications. Like their systems counterparts, database specialists (often called DBAs) manage the database platform for other teams to consume. Database foci could be high level, such as relational databases or non-relational (NoSQL) databases, or, more commonly, a DBA would focus on one or more very specific database products such as Informix, MS SQL Server, dBase, Firebird, PostgreSQL, MariaDB, MySQL, MongoDB, Redis, CouchDB and many more.

Applications. Applications are the final product that consumes all other platform components: physical systems, platforms, systems, storage, databases and more. Applications are the ultimate component of the computational stack and can take a massive variety of forms. Application specialists would never use that term but would be referred to as specialists on a specific application or set of applications. Some application families, such as CRM and ERP, are so large that an entire career might be spent learning and supporting a single one (such as an SAP ERP system), while in many other cases one might manage and oversee hundreds of small applications over a career span. Common application areas include CRM, ERP, email, web portals, billing systems, inventory tracking, time tracking, productivity and much more. Applications could include just about anything, and while some are high profile, such as an Exchange email system, others might be very trivial, such as a small desktop utility for calculating mortgage rates quickly.

Networking. Networks tie computers together and require much design and management on their own, making networking often the second largest discipline within IT. Network specialists work on buses, hubs, switches, routers, gateways, firewalls, unified threat management devices, VPNs, network proxies, load balancers and other aspects of allowing computers to speak to each other. Networking specialists typically focus on a vendor, such as Cisco or Juniper, rather than on product types such as switches or routers. Networking is, with systems, the best known and most commonly mentioned role in IT, even if the two are often confused. This role also supports the SAN (the actual network itself) for storage teams.

Security. Not truly an IT discipline in itself, but rather an aspect that applies to every other role. IT security specialists tend either to specialize by discipline (network security, application security) or to act in a cross-discipline role with a focus on security aspects as they cross those domains. Security specialists and teams might focus on proactive security, security testing and even social engineering.

Call Center, NOC or Helpdesk. The front-line role for monitoring systems across other domains, taking incoming calls and emails and assisting in triage and sometimes direct support for an organization, which may or may not include end users. This role varies heavily depending on who the direct “customer” of the service is and whether tasks are interrupt (monitoring) driven or queue (ticket) driven. Often the focus of this role is high-level triage, but it can cross dramatically into end user support. This discipline is often seen as a “helper” group to other teams.

End User Support. Whether working beside an end user in person (aka “deskside support”) or remotely (aka helpdesk), end user support roles work directly with individual end users to resolve individual issues, communicate with other support teams, train and educate, and so forth. This is the only IT role that would commonly have any interaction with non-IT teams (other than reporting “up” in the organization to management).

Hardware Technical Support. This role has no well-known name and is often known only by the fact that it works with hardware. This role, or family of roles, includes the physical support and management of desktop or laptop devices; the support and management of physical servers, storage systems or networking devices; and the physical management of a datacenter or similar. This is the portion of IT that rubs shoulders with the “bench” field (considered to be outside of IT) and consists of much grey area overlapping with it. Hardware support will often plug in and organize cables and generally works supporting other teams, predominantly platforms or systems. Separating IT hardware support from bench work is often nothing more than a matter of “operational mindset” and most roles could potentially go in either direction. Placing desktops on desks is often seen as falling to bench, whereas racking, stacking and monitoring server hardware is generally seen as IT hardware.

The Software RAID Inflection Point

In June 2001 something amazing happened in the IT world: Intel released the Tualatin-based Pentium III-S 1.0 GHz processor. This was one of the first Intel processors (IA32 architecture) to have crossed the 1 GHz clock barrier and the first of any significance. It was also special in that it had dual processor support and a double-sized cache compared to its Coppermine-based forerunners or its non-“S” Tualatin successor (which followed just one month behind). The PIII-S system boards were insanely popular in their era and formed the backbone of high performance commodity servers, such as the ProLiant and PowerEdge, in 2001 and for the next few years, culminating in the Pentium III-S 1.4 GHz dual processor systems that were so important that they kicked off the now famous HP ProLiant “G” naming convention. The Pentium III boxes were “G1”.

What does any of this have to do with RAID? Well, we need to step back and look at where RAID was up until May 2001. From the 1990s up to May 2001, hardware RAID was the standard for the IA32 server world, which mainly included systems like Novell NetWare, Windows NT 4, Windows 2000 and some Linux. Software RAID did exist for some of these systems (not NetWare), but servers were always struggling for CPU and memory resources; expending these precious resources on RAID functions was costly, caused applications to compete with RAID for access, and the systems would often choke on the conflict. Hardware RAID solved this by adding dedicated CPU and RAM just for these functions.

RAID in the late 1990s and early 2000s was also very heavily based around RAID 5 and, to a lesser degree, RAID 6 parity striping, because disks were tiny and extremely expensive per unit of capacity; squeezing maximum capacity out of the available disks was of utmost priority, and risks like UREs (unrecoverable read errors) were so trivial due to the small capacities that parity RAID was very reliable, all things considered. The factors were completely different than they would be by 2009. In 2001 it was still common to see 2.1GB, 4.3GB and 9GB hard drives in enterprise servers!

Because parity RAID was the order of the day, and many drives were typically used on each server, RAID had more CPU overhead on average in 2000 than it did in 2010! So the impact of RAID on system resources was very significant.
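To see where that CPU overhead comes from, here is a deliberately simplified sketch of the parity arithmetic behind RAID 5 style striping; it is illustrative only and is not how any particular controller or operating system implements it. The parity block is the XOR of the data blocks in a stripe, it must be recomputed (or adjusted via a read-modify-write) on every write, and the same XOR is used to rebuild a lost block.

# Simplified illustration of RAID 5 style parity (not a real implementation).

def xor_parity(blocks):
    # The parity block is the byte-wise XOR of all data blocks in the stripe.
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def rebuild_missing(surviving_blocks, parity):
    # A lost block is recovered by XORing the surviving blocks with the parity.
    return xor_parity(surviving_blocks + [parity])

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # three data blocks, one per data disk
parity = xor_parity(stripe)            # parity block stored on a fourth disk
assert rebuild_missing([stripe[0], stripe[2]], parity) == stripe[1]

Doing this byte-wise work for every write, across every member of the stripe, is exactly the kind of load that dedicated hardware RAID processors were added to carry.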

And that is the background. But in June 2001, suddenly the people who had been buying very low powered IA32 systems had access to the Tualatin Pentium III-S processors, with greatly improved clock speeds, efficient dual processor support and double-sized on-chip caches, which presented an astounding leap in system performance literally overnight. With all of this new power and no corresponding change in software demands, systems that traditionally were starved for CPU and RAM suddenly had more than they knew how to use, especially as additional threads were available and most applications of the time were single threaded.

The system CPUs, even in the Pentium III era, were dramatically more powerful than the small CPUs on the hardware RAID controllers, which were often entry-level PowerPC or MIPS chips, and the available system memory was often much larger than the hardware RAID caches; investing in extra system memory was also often far more effective and generally advantageous. So, with free capacity available on the main system, RAID functions could, on average, be moved from the hardware RAID cards to the central system and gain performance, even while giving up the additional CPU and RAM of the hardware RAID cards. This was not true on overloaded systems starved for resources, and it was most relevant for parity RAID, with RAID 6 benefiting the most and non-parity levels like RAID 1 and RAID 0 benefiting the least.

But June 2001 was the famous inflection point – before that date the average IA32 system was faster when using hardware RAID, and after June 2001 new systems purchased would, on average, be faster with software RAID. With each passing year the advantage has leaned more and more towards software RAID as underutilized CPU cores, idle threads and spare RAM have become abundant; the only factor working in hardware RAID’s favor has been the drop in parity RAID usage, as mirrored RAID took over as the standard once disk sizes increased dramatically while capacity costs dropped.

Today it has been more than fifteen years since the notion that hardware RAID would be faster was retired. The belief lingers on due primarily to the odd “Class of 1998” effect, but it has long been a myth, repeated improperly by those who did not take the time to understand the original source material. Hardware RAID continues to have benefits, but performance has not been one of them for the majority of the time that we have had RAID, and it is not expected to ever be one again.

Legitimate University Programs Are Not Certification Training

The university educational process is meant to broaden the mind, increase exposure to different areas, teach students to think outside of the box, encourage exploration, develop soft skills and make students better prepared to tackle further learning, such as moving on to the trade skills needed for specific fields.  The university program, however, is not meant to provide trade skills themselves (the skills used in specific trades); that is the role of a trade school.  Students leaving universities with degrees are not intended to be employable because of specific skill sets learned at college, but to be well prepared to learn on the job or to move on to additional education for a specific job.

In the last two decades, led primarily by for-profit schools looking to make money quickly without regard to the integrity of the university system, there has been a movement, especially in the United States, for trade schools to get accredited (an extremely low-bar requirement that has no useful standing outside of legal qualifications for educational minimums and should never be seen as a mark of quality) and sell trade degrees as if they were traditional university degrees.  This has been especially prevalent in IT fields, where certifications are broadly known and desired, acquiring properly skilled educational staff is expensive and essentially impossible to do at the scale necessary to run a full program, degree areas are easily misunderstood by those entering their college years, and the personality traits most common to people going into the field sadly make those people easy prey for collegiate marketing drives.  Additionally, the promise of easy classes, double dipping (getting the certs you need anyway and then getting a bonus degree for the effort) and the suggestion that having a degree and certs all at once will open doors and magically provide career options that pay loads of money triggers an emotional response that makes potential students less able to make rational financial and educational decisions.  It’s a predatory market, not an altruistic one.

Certifications play a fundamentally different role than a university education does.  Unlike university programs, certification is about testing very specific skills, often isolated by product or vendor – things that should never appear in any university program.  A certification may be broad (and closer to collegiate work), as with the CompTIA Network+, which tests a wide range of basic networking knowledge and nothing specific to a vendor or product, but even that is too specific to a single networking technology or group of technologies to be truly appropriate for a university; it is, at the very least, leaning in that direction.  More common certifications such as Microsoft’s MCSE, Cisco’s CCNA, or CompTIA’s Linux+ or A+ are all overly product and vendor specific, far too much “which button do I press” and far too little “what do the underlying concepts mean” for collegiate work.

Certifications are trade related and a great addition to university studies.  University work should prepare the student for broad thinking, critical thinking, problem solving and core skills like language, maths and learning.  Applying that core knowledge should then make achieving certifications easier and more meaningful.  University should show a background in soft skills and breadth, while certifications should show trade skills and specific task capabilities.

Warning signs that a university is behaving improperly in this area include overly specific programs that sound as if they are aimed at particular technologies, like a degree in “Cisco Networking” or “Microsoft Systems”; certifications being achieved during the university experience (double dipping – giving out a degree simply for having gotten certs); or a program that leans towards preparing someone “for the job”, is expected to “get the student a great job upon completion” or is expected to “increase salary”.  These are not goals of proper university programs.

Critically evaluating any educational program is very important, as educational investments are some of the largest that we make in our lives, both monetarily and in terms of our time commitments.  Ensuring that a program is legitimate and valuable, that it meets both our own goals and proper educational goals, and that it will be seen as appropriate by those who will evaluate it in the future (such as hiring managers) is very important.  There are many aspects on which we must evaluate the university experience; this is only one, but it is a newer problem, suddenly very prevalent, and one that specifically targets IT and technical hopefuls, so it requires extra diligence in our industry.