Get more out of your legacy systems: more performance, functionality, reliability, and manageability
Is your code easy to change? Can you get nearly instantaneous feedback when you do change it? Do you understand it? If the answer to any of these questions is no, you have legacy code, and it is draining time and money away from your development efforts.
In this book, Michael Feathers offers start-to-finish strategies for working more effectively with large, untested legacy code bases. This book draws on material Michael created for his renowned Object Mentor seminars: techniques Michael has used in mentoring to help hundreds of developers, technical managers, and testers bring their legacy systems under control.
The topics covered include
- Understanding the mechanics of software change: adding features, fixing bugs, improving design, optimizing performance
- Getting legacy code into a test harness
- Writing tests that protect you against introducing new problems
- Techniques that can be used with any language or platformwith examples in Java, C++, C, and C#
- Accurately identifying where code changes need to be made
- Coping with legacy systems that aren't object-oriented
- Handling applications that don't seem to have any structure
This book also includes a catalog of twenty-four dependency-breaking techniques that help you work with program elements in isolation and make safer changes.
About the Author
MICHAEL C. FEATHERS works for Object Mentor, Inc., one of the world's top providers of mentoring, skill development, knowledge transfer, and leadership services in software development. He currently provides worldwide training and mentoring in Test-Driven Development (TDD), Refactoring, OO Design, Java, C#, C++, and Extreme Programming (XP). Michael is the original author of CppUnit, a C++ port of the JUnit testing framework, and FitCpp, a C++ port of the FIT integrated-testing framework. A member of ACM and IEEE, he has chaired CodeFest at three OOPSLA conferences.
Table of Contents
I. THE MECHANICS OF CHANGE.
1. Changing Software.
2. Working with Feedback.
3. Sensing and Separation.
4. The Seam Model.
II. CHANGING SOFTWARE.
6. I Don’t Have Much Time and I Have To Change It.
7. It Takes Forever To Make a Change.
8. How Do I Add a Feature?
9. I Can’t Get This Class into a Test Harness.
10. I Can’t Run This Method into a Test Harness.
11. I Need to Make a Change. What Methods Should I Test?
12. I Need to Make Many Changes In One Area Do I Have To Break.
13. I Need To Make a Change but I Don’t Know What Tests To Write.
14. Dependencies on Libraries Are Killing Me.
15. My Application Is All API Calls.
16. I Don’t Understand the Code Well Enough To Change It.
17. My Application Has No Structure.
18. My Test Code Is in the Way.
19. My Project Is Not Object-Oriented. How Do I Make Safe Changes?
20. This Class Is Too Big and I Don’t Want It to Get Any Bigger.
21. I’m Changing The Same Code All Over the Place.
22. I Need To Change a Monster Method and I Can’t Write Tests for It.
23. How Do I Know That I’m Not Breaking Anything?
24. We Feel Overwhelmed. It Isn’t Going To Get Any Better.
III. DEPENDENCY BREAKING TECHNIQUES.
25. Dependency Breaking Techniques.
Do you remember the first program you wrote? I remember mine. It was a little graphics program I wrote on an early PC. I started programming later than most of my friends. Sure, I'd seen computers when I was a kid. I remember being really impressed by a minicomputer I once saw in an office, but for years I never had a chance to even sit at a computer. Later, when I was a teenager, some friends of mine bought a couple of the first TRS-80s. I was interested, but I was actually a bit apprehensive, too. I knew that if I started to play with computers, I'd get sucked into it. It just looked too cool. I don't know why I knew myself so well, but I held back. Later, in college, a roommate of mine had a computer, and I bought a C compiler so that I could teach myself programming. Then it began. I stayed up night after night trying things out, poring through the source code of the emacs editor that came with the compiler. It was addictive, it was challenging, and I loved it.
I hope you've had experiences like thismdjust the raw joy of making things work on a computer. Nearly every programmer I ask has. That joy is part of what got us into this work, but where is it day to day?
A few years ago, I gave my friend Erik Meade a call after I'd finished work one night. I knew that Erik had just started a consulting gig with a new team, so I asked him, "How are they doing?" He said, "They're writing legacy code, man." That was one of the few times in my life when I was sucker-punched by a coworker's statement. I felt it right in my gut. Erik had given words to the precise feeling that I often get when I visit teams for the first time. They are trying very hard, but at the end of the day, because of schedule pressure, the weight of history, or a lack of any better code to compare their efforts to, many people are writing legacy code.
What is legacy code? I've used the term without defining it. Let's look at the strict definition: Legacy code is code that we've gotten from someone else. Maybe our company acquired code from another company; maybe people on the original team moved on to other projects. Legacy code is somebody else's code. But in programmer-speak, the term means much more than that. The term legacy code has taken on more shades of meaning and more weight over time.
What do you think about when you hear the term legacy code? If you are at all like me, you think of tangled, unintelligible structure, code that you have to change but don't really understand. You think of sleepless nights trying to add in features that should be easy to add, and you think of demoralization, the sense that everyone on the team is so sick of a code base that it seems beyond care, the sort of code that you just wish would die. Part of you feels bad for even thinking about making it better. It seems unworthy of your efforts. That definition of legacy code has nothing to do with who wrote it. Code can degrade in many ways, and many of them have nothing to do with whether the code came from another team.
In the industry, legacy code is often used as a slang term for difficult-to-change code that we don't understand. But over years of working with teams, helping them get past serious code problems, I've arrived at a different definition.
To me, legacy code is simply code without tests. I've gotten some grief for this definition. What do tests have to do with whether code is bad? To me, the answer is straightforward, and it is a point that I elaborate throughout the book:
Code without tests is bad code. It doesn't matter how well written it is; it doesn't matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.
You might think that this is severe. What about clean code? If a code base is very clean and well structured, isn't that enough? Well, make no mistake. I love clean code. I love it more than most people I know, but while clean code is good, it's not enough. Teams take serious chances when they try to make large changes without tests. It is like doing aerial gymnastics without a net. It requires incredible skill and a clear understanding of what can happen at every step. Knowing precisely what will happen if you change a couple of variables is often like knowing whether another gymnast is going to catch your arms after you come out of a somersault. If you are on a team with code that clear, you are in a better position than most programmers. In my work, I've noticed that teams with that degree of clarity in all of their code are rare. They seem like a statistical anomaly. And, you know what? If they don't have supporting tests, their code changes still appear to be slower than those of teams that do.
Yes, teams do get better and start to write clearer code, but it takes a long time for older code to get clearer. In many cases, it will never happen completely. Because of this, I have no problem defining legacy code as code without tests. It is a good working definition, and it points to a solution.
I've been talking about tests quite a bit so far, but this book is not about testing. This book is about being able to confidently make changes in any code base. In the following chapters, I describe techniques that you can use to understand code, get it under test, refactor it, and add features.
One thing that you will notice as you read this book is that it is not a book about pretty code. The examples that I use in the book are fabricated because I work under nondisclosure agreements with clients. But in many of the examples, I've tried to preserve the spirit of code that I've seen in the field. I won't say that the examples are always representative. There certainly are oases of great code out there, but, frankly, there are also pieces of code that are far worse than anything I can use as an example in this book. Aside from client confidentiality, I simply couldn't put code like that in this book without boring you to tears and burying important points in a morass of detail. As a result, many of the examples are relatively brief. If you look at one of them and think "No, he doesn't understandmdmy methods are much larger than that and much worse," please look at the advice that I am giving at face value and see if it applies, even if the example seems simpler.
The techniques here have been tested on substantially large pieces of code. It is just a limitation of the book format that makes examples smaller. In particular, when you see ellipses (el) in a code fragment like this, you can read them as "insert 500 lines of ugly code here":m_pDispatcher->register(listener);...m_nMargins++;
If this book is not about pretty code, it is even less about pretty design. Good design should be a goal for all of us, but in legacy code, it is something that we arrive at in discrete steps. In some of the chapters, I describe ways of adding new code to existing code bases and show how to add it with good design principles in mind. You can start to grow areas of very good high-quality code in legacy code bases, but don't be surprised if some of the steps you take to make changes involve making some code slightly uglier. This work is like surgery. We have to make incisions, and we have to move through the guts and suspend some aesthetic judgment. Could this patient's major organs and viscera be better than they are? Yes. So do we just forget about his immediate problem, sew him up again, and tell him to eat right and train for a marathon? We could, but what we really need to do is take the patient as he is, fix what's wrong, and move him to a healthier state. He might never become an Olympic athlete, but we can't let "best" be the enemy of "better." Code bases can become healthier and easier to work in. When a patient feels a little better, often that is the time when you can help him make commitments to a healthier life style. That is what we are shooting for with legacy code. We are trying to get to the point at which we are used to ease; we expect it and actively attempt to make code change easier. When we can sustain that sense on a team, design gets better.
The techniques I describe are ones that I've discovered and learned with coworkers and clients over the course of years working with clients to try to establish control over unruly code bases. I got into this legacy code emphasis accidentally. When I first started working with Object Mentor, the bulk of my work involved helping teams with serious problems develop their skills and interactions to the point that they could regularly deliver quality code. We often used Extreme Programming practices to help teams take control of their work, collaborate intensively, and deliver. I often feel that Extreme Programming is less a way to develop software than it is a way to make a well-jelled work team that just happens to deliver great software every two weeks.
From the beginning, though, there was a problem. Many of the first XP projects were "greenfield" projects. The clients I was seeing had significantly large code bases, and they were in trouble. They needed some way to get control of their work and start to deliver. Over time, I found that I was doing the same things over and over again with clients. This sense culminated in some work I was doing with a team in the financial industry. Before I'd arrived, they'd realized that unit testing was a great thing, but the tests that they were executing were full scenario tests that made multiple trips to a database and exercised large chunks of code. The tests were hard to write, and the team didn't run them very often because they took so long to run. As I sat down with them to break dependencies and get smaller chunks of code under test, I had a terrible sense of deasjaag vu. It seemed that I was doing this sort of work with every team I met, and it was the sort of thing that no one really wanted to think about. It was just the grunge work that you do when you want to start working with your code in a controlled way, if you know how to do it. I decided then that it was worth really reflecting on how we were solving these problems and writing them down so that teams could get a leg up and start to make their code bases easier to live in.
A note about the examples: I've used examples in several different programming languages. The bulk of the examples are written in Java, C++, and C. I picked Java because it is a very common language, and I included C++ because it presents some special challenges in a legacy environment. I picked C because it highlights many of the problems that come up in procedural legacy code. Among them, these languages cover much of the spectrum of concerns that arise in legacy code. However, if the languages you using are not covered in the examples, take a look at them anyway. Many of the techniques that I cover can be used in other languages, such as Delphi, Visual Basic, COBOL, and FORTRAN.
I hope that you find the techniques in this book helpful and that they allow you to get back to what is fun about programming. Programming can be very rewarding and enjoyable work. If you don't feel that in your day-to-day work, I hope that the techniques I offer you in this book help you find it and grow it on your team.Acknowledgements
First of all, I owe a serious debt to my wife, Ann, and my children, Deborah and Ryan. Their love and support made this book and all of the learning that preceded it possible. I'd also like to thank "Uncle Bob" Martin, president and founder of Object Mentor. His rigorous pragmatic approach to development and design, separating the critical from the inconsequential, gave me something to latch upon about 10 years ago, back when it seemed that I was about to drown in a wave of unrealistic advice. And thanks, Bob, for giving me the opportunity to see more code and work with more people over the past five years than I ever imagined possible.
I also have to thank Kent Beck, Ron Jeffries, and Ward Cunningham for offering me advice at times and teaching me a great deal about team work, design, and programming. Special thanks to all of the people who reviewed the drafts. The official reviewers were Sven Gorts, Robert C. Martin, Erik Meade, and Bill Wake; the unofficial reviewers were Dr. Robert Koss, James Grenning, Lowell Lindstrom, Micah Martin, Russ Rufer and the Silicon Valley Patterns Group, and James Newkirk.
Thanks also to reviewers of the very early drafts I placed on the Internet. Their feedback significantly affected the direction of the book after I reorganized its format. I apologize in advance to any of you I may have left out. The early reviewers were: Darren Hobbs, Martin Lippert, Keith Nicholas, Phlip Plumlee, C. Keith Ray, Robert Blum, Bill Burris, William Caputo, Brian Marick, Steve Freeman, David Putman, Emily Bache, Dave Astels, Russel Hill, Christian Sepulveda, and Brian Christopher Robinson.
Thanks also to Joshua Kerievsky who gave a key early review and Jeff Langr who helped with advice and spot reviews all through the process.
The reviewers helped me polish the draft considerably, but if there are errors remaining, they are solely mine.
Thanks to Martin Fowler, Ralph Johnson, Bill Opdyke, Don Roberts, and John Brant for their work in the area of refactoring. It has been inspirational.
I also owe a special debt to Jay Packlick, Jacques Morel, and Kelly Mower of Sabre Holdings, and Graham Wright of Workshare Technology for their support and feedback.
Special thanks also to Paul Petralia, Michelle Vincenti, Lori Lyons, Krista Hansing, and the rest of the team at Prentice-Hall. Thank you, Paul, for all of the help and encouragement that this first-time author needed.
Special thanks also to Gary and Joan Feathers, April Roberts, Dr. Raimund Ege, David Lopez de Quintana, Carlos Perez, Carlos M. Rodriguez, and the late Dr. John C. Comfort for help and encouragement over the years.I also have to thank to Brian Button for the example in Chapter XX, I'm Changing the Same Code All Over the Place. He wrote that code in abot an hour when we were developing a refactoring course together, and it's become my favorite piece of teaching code.
Also, special thanks to Janik Top, whose instrumental De Futura served as the soundtrack for my last few weeks of work on this book.
Finally, I'd like to thank everyone whom I've worked with over the past few years whose insights and challenges strengthened the material in this book.
© Copyright Pearson Education. All rights reserved.
Most Helpful Customer Reviews
Feathers confronts a depressingly familiar problem encountered by many programmers. How to maintain a system of legacy code? Where often there has been no rigorous attempt to test it. Even in a manual fashion. He shows ways to build a test harness to automatically test the code. Even if this does not perform an exhaustive test, it may still be far in advance of what you already have (nothing?) to validate the code. He writes assuming that you might never have met a disciplined testing approach. So unit testing is carefully explained and he builds from there. Simple, useful patterns like Decorator are described. The book is not meant as a comprehensive exposition of patterns. But hopefully, you can see the general idea of patterns and its utility. Large portions of the book are essentially about refactoring legacy code into these patterns, if possible. And also about testing your changes in a systematic way. If you do the former, you should also do the latter. The examples are mostly in C++ and Java. But that's neither here nor there, if you program in other languages like C# or C. The ideas from the examples carry over well.
Great book, but get's kinda tedoius. Abstract interface, abstract implementation, so on and so on. However, I consider it to be a must read for programmers who don't work at startups (ie inevitably deal w/ legacy code).
This book provides some great insight into writing testable code even if you are not dealing with a "legacy" code base. His style is great and it really is an interesting and comfortable read.