The Real Chrysler Miracle | 100x Productivity Gains
Updated: Aug 11, 2020
Once upon a time, not so long ago, there was an IT breakthrough at Chrysler. No, I am not talking about XP. I'm talking about a breakthrough you've never heard of, until now. This breakthrough happened shortly after the XP team was disbanded, in a different area of the business. It was led by a consultant, Dr. Bird, who mostly worked alone. He never wrote a book about what he did. He wanted to keep his approach a trade secret, and he succeeded.
I am writing about it now almost 20 years later in the hopes it might inspire someone else.
Dr. Bird developed and deployed a fully functional software package to monitor factory operations. He did this project in 8 person-weeks. He replaced an ongoing initiative which had failed to deliver after 7 person-years of development. This productivity miracle puts even XP to shame.
In actuality, he replaced about 25 or 30 person-years of development effort in 8 person-weeks. You see, two Chrysler vendors had already tried and failed previously to build this plant monitoring and data gathering system. It was a third team, which I had a distant relationship with, which was about to fail a third time when our hero entered the scene. Like the previous two teams, the software wasn't going to be ready in time for several new manufacturing plants which needed it to open.
The third failing team had built yet another system (in Java) which was complex, slow, and buggy, and dangerously behind schedule. The project involved data collection for manufacturing plants. Not just a little data, but a great pile of data for every part which went through the plant and for every machine which touched the part (up to hundreds of machines per facility).
To make it more interesting, the data being collected would be different for different machines, and the data coming from a machine may change over time.
Chrysler needed the system to know lots of different things: which machines touched each part, how many machine cycles the machine had made when it touched a part, settings on the machine, machine cycle time, and on and on and on. And this was just the data collection. Once the system had the data, it needed to provide alerts on certain conditions, warnings on others, and reports and real-time views on it all. And all of this had to be updatable fast, almost in real time, as machines joined and left the floor, and as their settings changed.
Chrysler was so desperate to get the most recent system "fixed" that they gave a lone consultant a long shot at fixing the problem. He had a PhD, taught compiler construction at the University of Michigan, and came highly recommended... by me. My main role in this? I encouraged the experiment: "See if Peter can help."
Dr. Bird decided the best way to "fix" the system was to produce a completely new version of the system, which he did, almost entirely by himself, in two months. It was about 4000 lines of Java code, and 750 compiler productions.
Yes folks, he wrote four different domain-specific languages (DSLs) for this project. He wrote grammars. And he then compiled the requirements directly into Java. "All computation is translation," he liked to say.
Fortunately for him, the requirements were very neatly documented in four different spreadsheets, which is part of why this went so fast. He was able to rapidly clean the spreadsheets up and turn them into four complete sets of properly formed DSL sentences. Then he simply taught the sentences to execute, without rewriting them in code.
This is what his Chrysler manager later told me....
"A little less than two months later, this guy tossed me the product release, I had to laugh, he emailed me the release! How could something that will fit in my email possibly accomplish even a fraction of the requirements! I chuckled under my breath, and performed the install.
Then it hit me! This was working, just like he said it would.
And, I didn’t have to code the changes for each machine; it would create what was needed from the machine specifications!
I am always the skeptic, so I presented my first challenge: O.K., smart guy, each minute on this line costs $1000.00, sure you are collecting the down time, but I want to split the cost between every “Downed” machine on the line, on a per minute basis, and at the end of the day, I want to know which machine cost me the most dollars in productivity.
The next morning, I opened my E-Mail to find a configuration file.
I had him now….he forgot to send me the new release! A quick trip by his desk left me shaking my head, I didn’t need a new release, all I needed to do was apply my new business rules to the existing system!
My friend, this is agile.
This is what development was supposed to do for us.
I threw seven man-years of code in the trash that day, and launch a great system at a fraction of the cost. Using traditional methods, I could have spent years trying to design a system just to tell me the high cost machine on my lines, I got it from this system in a single day."
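The manager's challenge is easy to state as arithmetic, so here is a minimal sketch of it in Java. This is not Peter's code, which I never saw. The class name, the method names, the machine names, and the idea of a per-minute log of downed machines are all assumptions made for illustration. Each minute of downtime costs the line $1000, that cost is split evenly among whichever machines were down during that minute, and at the end of the day the machine with the largest total is reported.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: not the Chrysler system, just the manager's rule as arithmetic.
class DowntimeCost {

    /** Splits the per-minute line cost evenly among the machines down in each minute. */
    static Map<String, Double> split(List<Set<String>> downByMinute, double costPerMinute) {
        Map<String, Double> totals = new TreeMap<>();
        for (Set<String> down : downByMinute) {
            if (down.isEmpty()) {
                continue; // the line was running this minute, nothing to charge
            }
            double share = costPerMinute / down.size();
            for (String machine : down) {
                totals.merge(machine, share, Double::sum);
            }
        }
        return totals;
    }

    /** Returns the machine with the highest accumulated cost, if any machine was down. */
    static Optional<String> costliest(Map<String, Double> totals) {
        return totals.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Assumed log: one set of downed machines per minute (machine names are made up).
        List<Set<String>> log = List.of(
                Set.of("press-1"),
                Set.of("press-1", "welder-3"),
                Set.<String>of(),
                Set.of("press-1"));
        Map<String, Double> totals = split(log, 1000.0);
        System.out.println(totals);            // {press-1=2500.0, welder-3=500.0}
        System.out.println(costliest(totals)); // Optional[press-1]
    }
}
```

In the story the point was that this rule lived in a configuration file rather than in new Java, so treat the code above only as a statement of what the rule computes.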
Basically, a lone consultant on a critical project showed Chrysler a totally new way to develop industrial software, leveraging DSLs and Java. A way that INSANELY reduced cost, improved quality, and sped up maintenance requests and new features.
The software was successfully deployed across multiple plants in time for their opening. After person-years of wasted effort, the project was delivered in just eight person-weeks (not counting the very important time spent building the original spreadsheets).
Chrysler kept Peter around for the better part of a year, during which he added more features they dreamed up, frequently by only changing "the config file." He also reduced the number of lines of Java code to just under 2000, telling me "It would have been shorter to begin with but I didn't have much time."
What Peter did wasn't a one-off. I watched him do this four different times in four different industries. He was convinced most systems could be made this way. But Chrysler didn't pay attention to the breakthrough, and Peter, unlike Kent Beck with XP, kept his approach a trade secret. There is a little irony here. Peter hid his success. Kent Beck bragged about his failure. Go figure.
As some of you have surmised already, Peter's tools of choice for this project were Lex and Yacc (Yet Another Compiler-Compiler). Instead of compiling to machine code to execute his primitives, he compiled to Java, which allowed for more complex primitives and a "run anywhere" solution.
One reason Peter was able to do his project so fast and efficiently is that an analyst at Chrysler had neatly captured the core system requirements in a formal row and column structure of spreadsheets. This itself was a primitive grammar. It inspired Peter to think, "With a little clean up, I think I can compile these spreadsheet data directly into Java."
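To make "compile spreadsheet data into Java" a little more concrete, here is a toy sketch of translation from one row to one line of Java source. Everything in it is an assumption: the three-column row layout (machine, field, limit), the class and method names, and the shape of the generated method. Peter used Lex and Yacc grammars for this, while the sketch below uses plain string handling just to show the idea that a row is already a sentence that can be translated.

```java
// Hypothetical sketch: a one-row "compiler" from a spreadsheet row to Java source text.
class RowTranslator {

    /** Translates a row "machine, field, limit" into the source of a Java check method. */
    static String toJava(String row) {
        String[] cells = row.split(",");
        if (cells.length != 3) {
            throw new IllegalArgumentException("expected 3 cells: " + row);
        }
        String machine = cells[0].trim();
        String field = cells[1].trim();
        String limit = cells[2].trim();
        // Reject rows that would not produce valid Java.
        if (!machine.matches("[A-Za-z_]\\w*") || !field.matches("[A-Za-z_]\\w*")) {
            throw new IllegalArgumentException("not an identifier in: " + row);
        }
        Double.parseDouble(limit); // throws NumberFormatException on a non-numeric limit
        return "static boolean " + machine + "_" + field + "_ok(double " + field + ") "
                + "{ return " + field + " <= " + limit + "; }";
    }

    public static void main(String[] args) {
        // Prints: static boolean press1_cycle_ms_ok(double cycle_ms) { return cycle_ms <= 4500; }
        System.out.println(toJava("press1, cycle_ms, 4500"));
    }
}
```

Change the row and the generated Java changes with it, which is the property the Chrysler manager noticed when he did not have to code the changes for each machine.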
So what was in those four spreadsheets? Unfortunately, I never saw them myself. We don't know, and Chrysler threw this all away a long time ago. But based on how Peter talked about DSLs, and guessing the needs of a factory floor from lean, I will take an educated guess at what they likely contained.
Spreadsheet #1: Protocol
Protocol is "a set of conventions governing the treatment and especially the formatting of data in an electronic communications system." It seems highly likely that one of the spreadsheets described communication protocols of the various machines on the factory floor. The spreadsheet likely included the name of the machine, model, and address of the machine on the network, as well as a description of the various messages the machine could send and receive. Seems reasonable.
Spreadsheet #2: Routing
Routing is the forwarding of messages to new locations. A primary responsibility of this spreadsheet was probably telling the system where different types of messages should go next. I suspect this spreadsheet routed information to text messages, email, Andon boards, and most importantly the long-term data store. Peter later talked about routing as a key DSL. He also realized he could embed SQL directly into a cell of the spreadsheet to describe how a message is moved to the long-term data store. This is very clever, very clean, and very fast.
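Since I am guessing at the routing spreadsheet anyway, here is a guess in code form. It is a sketch under stated assumptions, not a reconstruction: the pipe-separated row layout (message type, destination, optional SQL), the message type names, the table names in the SQL, and every class and method name are hypothetical. It only shows how rows with SQL embedded in a cell could be loaded into a lookup table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a routing table loaded from spreadsheet-like rows.
class RoutingTable {

    /** One destination for a message type, with the SQL cell (empty if none). */
    static final class Route {
        final String destination;
        final String sql;

        Route(String destination, String sql) {
            this.destination = destination;
            this.sql = sql;
        }
    }

    private final Map<String, List<Route>> routes = new HashMap<>();

    /** Loads rows of the assumed form: messageType | destination | optional SQL. */
    static RoutingTable fromRows(List<String> rows) {
        RoutingTable table = new RoutingTable();
        for (String row : rows) {
            // Limit 3 keeps any '|' or ',' inside the SQL cell intact.
            String[] cells = row.split("\\|", 3);
            if (cells.length < 2) {
                throw new IllegalArgumentException("bad row: " + row);
            }
            String sql = cells.length == 3 ? cells[2].trim() : "";
            table.routes.computeIfAbsent(cells[0].trim(), k -> new ArrayList<>())
                    .add(new Route(cells[1].trim(), sql));
        }
        return table;
    }

    /** Returns the routes for a message type, or an empty list if none are configured. */
    List<Route> routesFor(String messageType) {
        return routes.getOrDefault(messageType, List.of());
    }

    public static void main(String[] args) {
        RoutingTable table = fromRows(List.of(
                "MACHINE_DOWN | andon |",
                "MACHINE_DOWN | datastore | INSERT INTO downtime (machine, started_at) VALUES (?, ?)",
                "CYCLE_COMPLETE | datastore | INSERT INTO cycles (machine, part, cycle_ms) VALUES (?, ?, ?)"));
        for (Route r : table.routesFor("MACHINE_DOWN")) {
            System.out.println(r.destination + " -> " + (r.sql.isEmpty() ? "(no SQL)" : r.sql));
        }
    }
}
```

Adding a new destination for a message type is then one more row, with no new Java.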
Spreadsheet #3: Rete
Peter also talked about rete as a key DSL for all systems. A Rete algorithm "is a pattern matching algorithm for implementing rule-based systems." I suspect they had a spreadsheet describing rules for when to take various actions. A Rete rule may be triggered by a new message alone, content in the long-term data store, or some combination. As new messages were created by rete they would be sent on to routing.
It is in the rete and routing spreadsheets (configuration files as the client called them) that Peter likely made the change overnight to answer the Chrysler manager's newly conceived question about costs.
If the grammar in this spreadsheet is general enough, and it was, new rules can be created and executed in minutes which might take a traditional software delivery team days or weeks or months to implement and test.
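Here is a minimal sketch of what "rules as sentences" could look like. To be clear about what it is not: this is a naive matcher that checks every rule against every message, not the Rete algorithm, which builds a network that shares and remembers partial matches across rules. The sentence form "WHEN field op number THEN action", the field names, and the class and method names are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: rule sentences parsed into executable checks (naive, not Rete).
class RuleSheet {

    static final class Rule {
        final String field;
        final String op;
        final double threshold;
        final String action;

        Rule(String field, String op, double threshold, String action) {
            this.field = field;
            this.op = op;
            this.threshold = threshold;
            this.action = action;
        }

        /** True when the message carries the field and the comparison holds. */
        boolean matches(Map<String, Double> message) {
            Double value = message.get(field);
            if (value == null) {
                return false;
            }
            switch (op) {
                case ">": return value > threshold;
                case "<": return value < threshold;
                default:  return value == threshold; // the only other operator is "="
            }
        }
    }

    // Assumed sentence form: WHEN <field> <op> <number> THEN <action>
    private static final Pattern SENTENCE = Pattern.compile(
            "WHEN\\s+(\\w+)\\s*([<>=])\\s*(-?\\d+(?:\\.\\d+)?)\\s+THEN\\s+(.+)");

    static Rule parse(String sentence) {
        Matcher m = SENTENCE.matcher(sentence.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a rule: " + sentence);
        }
        return new Rule(m.group(1), m.group(2), Double.parseDouble(m.group(3)), m.group(4).trim());
    }

    /** Returns the actions of every rule that matches the message, in rule order. */
    static List<String> fire(List<Rule> rules, Map<String, Double> message) {
        List<String> actions = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.matches(message)) {
                actions.add(rule.action);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                parse("WHEN cycle_ms > 4500 THEN ALERT slow-cycle"),
                parse("WHEN temperature_c > 90 THEN WARN overheating"));
        // Prints: [ALERT slow-cycle]
        System.out.println(fire(rules, Map.of("cycle_ms", 5000.0, "temperature_c", 70.0)));
    }
}
```

A new rule is a new sentence in the sheet. Nothing is recompiled, which is the overnight turnaround the manager described.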
Spreadsheet #4: Reporting and Events?
Alright, I don't really have a guess what was in this spreadsheet. Perhaps reporting and events? You can guess what reports are, and events may be other additional compound constructs, perhaps temporal, which trigger rete and routing to do their magic.
"At its heart, this grammar is not meant to be a secret code, but instead it is an open code, in that its meaning is meant to be able to be "read" by all, and dealt with in the sense of it being understandable and actionable. The only interpretation possible is through Context (Dialect and Goal) and Glossary (definitions).
I don't believe the context should extend down into the sentence, at least from a human perspective. It might be necessary from a system perspective to understand what the word type is, but that isn't necessary to spell out in the document. " -Dr. Peter Bird
I bumped into Peter at work, and he told me about his progress on his Chrysler project. He was busy doing his YACC to Java productions and writing the few thousand lines of Java code. The crunch was on, as the deadline for opening factories was just a few months away.
He told me he thought he was going to finish the rewrite in the next week or so. I was impressed. I was also slightly concerned about quality, you know, when a huge multi-person-year project is rewritten by one programmer in just a few weeks.
"Peter, you know how in XP we are writing unit tests to cover all of the production code. Could you do that in your system?"
"I think so," he replied. "Although it will be different. I may be able to automatically generate the tests I need."
"It would be great," I encouraged.
I knew with Peter a sound idea would not take much more encouraging, and I thought my idea was sound, so I said no more.
I really loved the results of my teams who were doing Test Driven Development. Peter had written unit tests for his Java code, but I was interested in how he would test the generated code. Earlier Peter had explained to me that with translation you could mathematically prove the soundness of a grammar. He could literally write grammars guaranteed to be error free in the translation. He had not, however, done the mathematical proofs for this code. So the additional testing was another approach.
A month later I ran into Peter again. He was all smiles.
"It worked flawlessly," he said. "The customer almost didn't believe it."
"You replaced the previous system?" I asked. "The whole thing worked?"
"Yes. Not a single defect."
"Did you do automated tests?"
"Yes," he replied. "I wrote a special grammar to automatically generate tests. I ran them the night before the release. I ran about 350 million tests through the system."
"What? How many tests?"
"About 350 million different tests. They all passed, so I emailed the release to the customer."
I was so fixated on the 350 million tests I missed the fact that his executable, for this extremely complex and important system, was so small it could be attached to an email. Oh, and as cool as those 350 million tests were they found no defects... because there were no defects.
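I never saw Peter's test grammar, but the reason a grammar can produce so many tests is simple multiplication, and that part can be sketched. In the toy version below, a "grammar" is just an ordered list of slots, each with a few alternatives, and every combination becomes one test sentence. The slot contents and the class and method names are assumptions for illustration, and a real grammar with recursion would need a proper generator rather than this flat product.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerating every sentence of a tiny, non-recursive grammar.
class SentenceGenerator {

    /** Returns every sentence formed by choosing one alternative per slot, in slot order. */
    static List<String> enumerate(List<List<String>> slots) {
        List<String> sentences = new ArrayList<>();
        sentences.add("");
        for (List<String> alternatives : slots) {
            List<String> next = new ArrayList<>();
            for (String prefix : sentences) {
                for (String word : alternatives) {
                    next.add(prefix.isEmpty() ? word : prefix + " " + word);
                }
            }
            sentences = next; // the count multiplies by alternatives.size() at each slot
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<List<String>> grammar = List.of(
                List.of("WHEN"),
                List.of("cycle_ms", "temperature_c", "pressure_kpa"),
                List.of("<", ">", "="),
                List.of("0", "100", "4500", "99999"),
                List.of("THEN"),
                List.of("ALERT", "WARN"));
        List<String> cases = enumerate(grammar);
        // 1 * 3 * 3 * 4 * 1 * 2 = 72 sentences
        System.out.println(cases.size());  // 72
        System.out.println(cases.get(0));  // WHEN cycle_ms < 0 THEN ALERT
    }
}
```

Six small slots already give 72 cases. Add a few more slots with a handful of alternatives each and the count climbs into the millions quickly, which makes a number like 350 million less surprising than it first sounds.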
"My friends, this is Agile."-Chrysler Manager Writing About Peter and Grammars