A roundtable of experts discusses what the Y2K problem might mean to design engineers.
What will happen to computers worldwide on Jan. 1, 2000? Will their software crash or start generating gibberish when the calendar reads 01/01/00? And if the software goes belly up, will civilization come crashing down as well? Or will everything be hunky dory, with the exception of a handful of out-of-work computer consultants?
To try to answer those questions, Machine Design talked with several computer experts and Y2K gurus. While reassuring, their answers leave some room for caution.
Is the Y2K Problem real?
Raymond K. Neff: There is a problem and the source is absurdly simple: It was the limitations in the paper-punch card, the analog representation of digital data in the early days of computing. The card had only 80 spaces for data, so to save space, programmers routinely omitted the first two digits of the year. This habit stayed with us even as advances brought larger computer memories and more storage. It was extended to microprocessor chips, almost all of which process dates with two-digit years encoded in their hardware. So when the New Year comes, noncompliant computers, those that still rely on two-digit numbers for describing the year, will process the data for year “00” as representing 1900, and this could cause problems.
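The failure Neff describes can be illustrated with a minimal Python sketch. It assumes a noncompliant program that stores only the last two digits of the year and always prefixes "19" when it needs the full year; the function names `naive_full_year` and `years_elapsed` are hypothetical and chosen only for illustration.

```python
def naive_full_year(two_digit_year: int) -> int:
    """Expand a two-digit year the way a noncompliant system might: always 19xx."""
    return 1900 + two_digit_year


def years_elapsed(start_yy: int, end_yy: int) -> int:
    """Difference in years between two dates stored with two-digit years."""
    return naive_full_year(end_yy) - naive_full_year(start_yy)


# A record created in 1995 ("95") and checked in 1999 ("99") behaves as expected.
print(years_elapsed(95, 99))  # 4

# The same record checked in 2000 ("00") is treated as if it were checked in 1900.
print(years_elapsed(95, 0))   # -95
```

Any calculation built on that difference, such as interest, age, or a maintenance interval, inherits the wrong sign and magnitude.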
Ed Stengel: There are estimates that over 65% of the hardware sold in 1997 was noncompliant, and over 35% sold in 1998 was also noncompliant, so the scale of the problem is quite large and will affect large and small companies. Most larger companies, however, started working on the problem years ago.
Jon Arnold: Depending on the industry, the risks of Y2K can be great and if they are not understood and managed, they could seriously harm businesses, suppliers, the Government, or even the economy as a whole. In the electric-utility industry, we spend a tremendous amount of time and effort to ensure we have the most reliable electricity service in the world. Fortunately, Y2K’s impact on our ability to generate, transmit, and distribute electricity has been minimal. We have issues with monitoring and ancillary equipment but Y2K problems in critical systems are few. However, no industry is an island and we are dependent on suppliers (e.g., telecommunications, coal, gas, water) and are spending a great deal of time making sure all the critical systems and suppliers are ready.
Joel Orr: This question is still very controversial. There are authoritative people on both sides of the issue, or rather, all across the spectrum, from “the end of the world as we know it” to “no one will notice.”
As we get closer to the end of the year, I have found, interestingly, that there are fewer and fewer people on the “sky is falling” end of the scale. I have been involved with computers at fairly technical levels for all my professional life, and I find that most of my colleagues have difficulty believing that the problem is very great. There are several reasons for this.
First, most mission-critical systems are fairly easy to test for the problem. An organization would have to be extremely irresponsible not to have done so. Second, the phenomenon of relying completely on computers is fairly recent. In older systems, there are many noncomputer-based back-ups and many checks and balances built in. And third, relatively few systems have critical functions that depend on knowing the right date.
Those who believe strongly that there will be a big negative impact point to embedded systems — computers that are part of instrumentation and control systems like those that regulate nuclear and fossil-fuel power plants, and large manufacturing and chemical plants. These people make many statistical arguments, saying that there are literally billions of such systems, and even if only a small fraction of them failed, they would wreak havoc.
But the fact is that component failure is not unknown in any system. The electric company doesn’t stop working just because a control processor overheats. The potentially large problems come when a large number of such processors decide to fail at the same moment, something that is a statistical impossibility. The fear is that Y2K will produce precisely such an effect, with cascading results.
Personally, I’m on the end of the spectrum that believes there will be minimal negative impact from the Y2K problem itself. I do however have a concern about secondary effects, the result of people doing strange things “just in case,” especially with their bank deposits.
Maria Reeths: Yes, Y2K is a problem, but nowhere near as large as some people would like you to believe. The sky is not going to fall. It’s going to be more like a speed bump than a multicar pileup. There will be some problems, but life as we know it will not end.
Mark P. Haselkorn: Yes, it’s a real problem. But in January 1989, the National Institute of Standards and Technology changed its standard from two-digit to four-digit dates, more than a decade ago. So it’s not like we didn’t know about this. And the problem isn’t with computers, it’s with data. If you change your computer, but your data stays the same, then you haven’t addressed the problem. If your data is still two-digit-dated, you’ve still got it. The easy misconception is that this is about modernization. It’s not. It’s about maintaining your infrastructure and maintaining existing systems, not replacing equipment.
Who will be affected most?
Haselkorn: Companies using software and hardware that are date-data dependent and those that have a lot of interdependencies and exchange data across organizations. There are also suppliers and clients. They have systems you don’t control but your company needs those systems to function. That is going to cause problems for several companies, especially those with clients and suppliers in other countries.
Stengel: Small companies and large government agencies will be affected most. But individuals have the highest risk of failure because most will not spend the money to fix a problem they are not sure they have until it happens. Even then, many individuals don’t use date-sensitive software and won’t even realize the problem has occurred.
For large companies, however, the Gartner Group, a Stamford, Conn.-based consultancy, has estimated the chance of critical failure. They put it at 15% for the banking industry, 33% for the retail industry, 50% for utilities, transportation and shipping industries, 66% for farming, agriculture and food processing industries, and 66% for government agencies and local municipal services.
Is the media coverage of the Y2K problem hype or accurate reporting?
Orr: The actual glitch is now just a part of the problem. The media, especially the Web, is an enormous uncontrolled feedback system in which signals other than the truth, or what you intend to say, get amplified and distorted. And it seems these distorted truths, or rumors, spread much more quickly than facts.
Because many people don’t trust computers, and with good reason, it’s easy to believe that the mysterious priesthood that controls them is completely impractical and out of control.
Most importantly, big bad news sells lots of newspapers, or their Web equivalents. So those who have taken it upon themselves to be the town criers benefit more from spreading news of impending doom than from telling everyone that things can be worked out, if we all work together. Is it any wonder, then, that bad news gets top billing?
Haselkorn: The hype works two ways and you get it at both extremes. Either the world is coming to an end and we should head for the hills, which is coming from the Chicken Littles, or there are the Pollyannas saying everything is OK, this is all about consultants making money, there’s nothing to worry about here. Both of these are wrong, but both get the most press. That’s because it’s easier to tell the extreme story rather than the difficult one that exists in the middle. That’s the hard story to tell.
But if trends continue, this will be the second most expensive event in human history, second only to World War II, especially if you include the liability costs. So even if nothing else happens and we spend a couple of hundred billion dollars, is that a big problem? Of course it is. The way the hype is, it’s only a problem if governments topple, airplanes fall out of the sky, or half the population suffers a power outage for a week or two.
Arnold: The fact is, the Y2K situation is a problem and there were a number of folks who raised the issue and created awareness early on. Maybe they exaggerated. But in most cases it took a while for the message to sink in. I have been working on Y2K issues since 1994 with electric utilities and it took a while before Y2K became more than an Information Technology issue. Eventually it evolved to include embedded systems and took on a more corporate perspective.
Have the U.S. and its companies responded properly?
Stengel: Many U.S. companies and organizations have responded properly and currently have or will have the problem addressed for hardware and software before the end of this year. This is especially true with the banking sector. However, the majority of companies, primarily the smaller ones, have waited so long that once they try to solve the problem in the “11th hour,” either the hardware, software, or resources won’t be available to help them because of the demand.
Neff: Personally, I’m proud of the way IT professionals have worked together with the government to prepare for this problem. Many people believe the problems, if they occur, will happen in the first few weeks of the new year. But any system that uses date information needs to be checked. Car manufacturers, for example, are now designing cars with hundreds of chips and embedded systems, many of which use an internal clock. I believe there will be a host of recalls as car companies replace or update their systems.
And some software companies are still using two-digit dates. Microsoft, for example, released Excel Version 7 and Office 97 and 98 with two-digit dates for years. The numbers 00 through 29 represent years spanning 2000 to 2029, while 30 through 99 cover years 1930 to 1999. So these problems will be with us for as long as companies use two-digit dates.
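The two-digit convention Neff describes is often called date windowing. The Python sketch below follows the ranges quoted above (00 through 29 read as 2000 to 2029, 30 through 99 as 1930 to 1999). It is an illustration of that rule only, not a reproduction of any vendor's code, and the names `PIVOT` and `windowed_year` are hypothetical.

```python
PIVOT = 30  # assumed cutoff: two-digit years below this are read as 20xx


def windowed_year(two_digit_year: int, pivot: int = PIVOT) -> int:
    """Expand a two-digit year using a fixed 100-year window around the pivot."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    century = 2000 if two_digit_year < pivot else 1900
    return century + two_digit_year


print(windowed_year(0))   # 2000
print(windowed_year(29))  # 2029
print(windowed_year(30))  # 1930
print(windowed_year(99))  # 1999
```

The window only moves the ambiguity. A birth year of 1925 stored as "25" would be read as 2025, and a date in 2030 stored as "30" would be read as 1930, which is why the problem persists for as long as two-digit years are stored.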
Reeths: Both groups have responded well. Almost everyone who takes this seriously is sure of their own systems but not so sure of their suppliers and clients. Because liability and litigation are of concern, companies find it necessary to create massive paper trails that either state their own compliance or request statements of compliance from their vendors and customers.
Arnold: The President’s Council on Year 2000 Conversion has done a tremendous job over the past 18 months mobilizing the Federal Government, private industries, and state and local entities, as well as raising the issue internationally. Because of their efforts, the U.S. has an acceptable level of awareness and readiness. Large companies and critical infrastructure industries have stepped up to meet the Y2K challenge. Everyone probably would have liked to have gotten an earlier start and there is much more work to be done, but spending a lot of time looking in the rearview mirror won’t solve any outstanding Y2K issues.
Orr: Most utility companies, banks, airlines, and other organizations that control computers on which people’s lives or comfort depend have taken some notice of the issue and taken some action.
Frankly, most companies are imperfect in matters of policy, safety, and “big issues.” But most manage to keep going because they have a small number of people who just do whatever it takes to get the job done, to make things right. I have a lot of faith in those people who still have a spark of the spirit that made America great.
Haselkorn: Yes and no, of course. On one hand, as IEEE/USA has recently said in its position statement on liability, there was never a possibility that any large, complex organization could fully protect itself from all Y2K business failures. It’s an impossibility. But companies should take steps to minimize harm and be prepared to respond quickly so that there’s not much of an impact.
And it’s not that the companies aren’t responding properly, it’s that the legal system doesn’t allow them to respond properly. The legal system says you have to worry about what you are responsible for, what you can control, and, most importantly, what you can be sued for. But this is largely a problem of dependency, and the biggest problems are under no one’s control; they fall at the borders and need to be negotiated. Therefore, focusing on things for which you are legally responsible is not the same as focusing on reducing harm. That’s a problem, and I can’t blame companies for that. They’d have to be crazy to give themselves a huge legal exposure. But that’s why we’re having all these debates in Congress about minimizing liabilities.
Will other countries have larger or smaller Y2K problems than the U.S.?
Arnold: It varies according to the level of automation that a country depends on. Many industrialized countries have high levels of automation and are putting forth significant Y2K efforts. Other countries may not have as much automation and the level of effort and awareness may be much lower. You have to remember that many developing countries experience infrastructure problems on a weekly basis. Reports from the CIA and U.S. State Dept. point to many possible problems and issues. From talking with U.S. companies that have facilities overseas, there tends to be the most concern about China, some of the former Soviet Bloc countries, and a few in South America.
Haselkorn: Most other countries have fewer resources to deal with the problem and started later. But they are less dependent on computers, so they have fewer systems to worry about. And some countries have so many other problems, Y2K can’t be on the top of their list. So some countries are likely to consciously neglect the Y2K issue or wait to see what happens before they come up with a plan for dealing with it. In some ways, this is a disease that we exported. It’s our equipment, it’s our software, we sold it to them, so perhaps rather than fix it, they’ll get angry at us. That’s one reason we are trying to help them.
Reeths: Countries with outdated systems will have problems. And underdeveloped countries that don’t have many systems obviously won’t experience many problems.
Orr: It’s true that most other countries have fewer computer-dependent systems than we do, so the Y2K issue will be smaller in most other countries than it is here. The focus of speculation is, of course, on countries with large but poorly maintained infrastructures, like Russia. But their problems are already so great they will dwarf the Y2K bug.
Stengel: Most other countries will have much larger problems than the U.S. I would rate the Y2K readiness of countries, from most to least prepared, as follows: 1. U.S. and Canada, 2. Australia, 3. Britain, France, Germany and the rest of the Western European countries, 4. South America, 5. Middle East, 6. Russia and Eastern European countries, 7. China and Third World countries.
What should companies be doing now?
Arnold: At this stage, companies should be finishing their Y2K remediation and testing work, and be fully engaged in contingency planning. In addition they need to make sure they have clear management procedures in place that will ensure that “clean” systems do not get contaminated with new Y2K bugs. Other action items include putting an event management team and process in place that will respond to problems during the Y2K event periods, and they should be communicating with customers and be involved in open and constructive dialog concerning their Y2K readiness and expectations.
From the standpoint of electric utilities, we believe Y2K risks are manageable. Electric utilities are diligently testing and correcting for anomalies, and these anomalies do not appear to impact electric operations. Contingency plans further minimize potential effects on customers. Customer Y2K problems due to electric power don’t appear worse than typical day-to-day risks.
Stengel: At this stage, companies should ensure that both their hardware and software are Y2K compliant. If they are not, then work as quickly as possible to ensure that they are.
Orr: The same thing they should have done years ago: Define a hierarchy of computer systems within their boundaries. Determine which are the most critical. Starting with those, test for Y2K problems. Where that is impractical, assume the worst and have alternate systems ready to cut in.
Reeths: Contingency planning should have been going on throughout the year, and by now companies should be getting ready to do a readiness drill, a full-dress rehearsal. It should confirm that procedures for contingencies are in place, so that if some Y2K failure occurs, you are ready to address it accordingly.
Haselkorn: The obvious one is contingency plans, but the one that people talk less about is preparation to respond, and it’s not quite the same. Companies should recognize that they are in a risk-management situation and be prepared to respond as quickly as possible where there are high risks to business functions.
The other key thing: Don’t insist things have to be a certain way. Don’t be too rigid. Take air traffic control. Those systems have problems all the time, but we have backups. Pilots go to line-of-sight separation rather than voice control; rather than landing on parallel runways every two minutes, planes land on the same runway every ten minutes. It’s still safe, it just slows down. The problem comes when someone decides they can’t live with those delays and they have to push the envelope. That’s when you’re going to have errors. We all have to be a little more flexible in the way we deal with this problem.
It is a potential crisis, so you need a little of that crisis mentality. You don’t drive fast in the middle of an ice storm; you change the way you drive. And that’s what we have to do.
What should individuals do?
Arnold: They need to do their own evaluation with the expectation there will probably be some issues that will affect them. Hopefully, most of these will be minor. There are organizations that specialize in consequence management like the American Red Cross that give guidance to individuals. Also, many state and local entities are giving out guidance on individual preparations. This past May the President’s Council on Year 2000 Conversion kicked off the Council’s Y2K Community Conversations campaign and is working to support the convening of hundreds of public meetings across the U.S. to focus on local Y2K readiness.
Orr: I’m not sure about “should;” I am reluctant to prescribe. My family and friends are acting as we would if we knew that a hurricane of possibly large (but not colossal) proportions was headed our way. We are making sure that we have water, food, and health supplies for two weeks. Others are stockpiling stores for three or four months.
Stengel: If individuals do not use date-sensitive software or do not care what date it is, then they could probably get away with doing nothing. Otherwise, they should ensure that their hardware and software are compliant. This can be confusing for individuals who are not computer “experts.”
The only advice that I would give is: Be very careful when implementing a “fix,” buying new hardware, or hiring a Y2K consultant. Many low-cost “fixes” on the market claim to solve the Y2K problem but they don’t offer any guarantee and will give individuals a false sense of security. Other “fixes” seem low-cost but require a highly paid computer professional to install them.
There are also a lot of so-called “Y2K Consultants” that are springing up on an almost daily basis. Some of these are legitimate and knowledgeable, while others are people simply trying to exploit an opportunity. Make sure you request and check the references and background of any Y2K consultant you approach for advice. And get it in writing. If you are buying a “fix,” a new computer, or new software, make sure you get a written statement guaranteeing that the product is Y2K compliant — not “Y2K ready” or “Y2K capable” — from the person you buy it from. If they will not give you a statement in writing, don’t buy it.
Haselkorn: People shouldn’t stress the system. They shouldn’t hoard. They shouldn’t all try to take money out of the bank at the last minute. They shouldn’t do things that will make the problem more difficult.
Reeths: Prepare as if for a hurricane or storm. Have some extra cash, food, and medications. And make sure you have copies of important paperwork. But there’s no need to stockpile or buy guns, or empty bank accounts.
What will be the most catastrophic effects of the Y2K problem?
Stengel: No one knows what will really occur. Some people say nothing will happen and others are predicting the end of the world. Realistically, I do not think that there will be any major catastrophes in countries like the U.S. and Canada. However, I do have a concern that less prepared countries like China and Russia could have some serious problems that affect the rest of the world. These include problems with their financial service, and nuclear and military sectors. In North America there could be some minor power interruptions and possible problems with the retail sector (unprepared small businesses). For the most part, there won’t be any direct catastrophic effects since North America will be generally well prepared by the end of 1999.
Arnold: It appears very unlikely that there will be significant problems with critical infrastructure industries in the U.S. This includes electric and gas utilities, telecommunications and banking and financial industries. For electric utilities, Jan. 1, 2000 should be like any other Jan. 1 in terms of keeping the lights on.
Overall there will be some Y2K failures and inconveniences and these will be primarily on the information technology side of the problem. There are so many computers and software programs out there that we are bound to have a few problems, and some of the problems may not show up for some time. Expectations need to be set that some problems will surface in the U.S., but they don’t appear to be catastrophic. The international readiness factor could create some problems in terms of supply chain issues and in the financial sector. That may turn out to be the most vulnerable area.
Reeths: It will probably be more of an annoyance than a problem. Maybe a wrongly calculated phone bill, not the power grid going down.
Orr: Martial law might be declared in some places, ostensibly in response to the Y2K situation. This is probably more likely than any really direct results of Y2K.
Haselkorn: The worst effects will be where the risks are greatest and that will probably be in international functions where there are dozens of companies working together, and they all have to function properly. Take the oil industry, for example. Production has to work in Venezuela, the ports have to work, the ships, refineries, and so on.
The message that hasn’t come across yet is that where life and limb are at risk (on airlines and elevators, for example), systems are well engineered and designed with failure in mind. Elevator brakes fail in the brake position and there are redundancies in airliners. We aren’t so stupid that we put our lives at risk on something without backups.
The overall lesson is that computer software is generally not well engineered. It’s not designed with failure in mind. It’s not robust. And although there is something called computer engineering, right now it’s just lip service. We need real computer engineering.
But there is a silver lining to Y2K. It has highlighted our interdependence and the need for cooperation on a global scale. As a result, the U.S. invited the Russians into nuclear control areas in the U.S. and vice versa because we knew we needed to work together. It will also spur some basic changes in the way we manage our critical infrastructure which were going to come about one way or another. For example, we tend to think that maintenance is some low-level activity that you wouldn’t want your child to grow up and do. Instead, you want your children to be engineers and designers. After all, it’s the Ph.D.s who do the real work. Any high-school drop-out can do maintenance. Well, that’s just false and this proves it. This problem is about maintenance and how things evolve and are kept running in the real world.
We’re also going to learn something from this. We don’t spend $500 billion and not learn something.