Linux Troubleshooting for System Administrators and Power Users

Overview

Linux is a fast-growing operating system with power and appeal, and enterprises worldwide are quickly adopting it to take advantage of its benefits. But as with all operating systems, performance problems do occur, sending system administrators scrambling into action. Finally, there is a complete reference for troubleshooting Linux–quickly! Linux Troubleshooting for System Administrators and Power Users is THE book for locating and solving problems and maintaining high performance in Red Hat® Linux and Novell® SUSE® Linux systems.

This book not only teaches you how to troubleshoot Linux, it shows you how the system works–so you can attack any problem at its root. Should you reinstall if Linux does not boot? Or can you save time by troubleshooting the problem? Can you enhance performance when Linux hangs or runs slowly? Can you overcome problems with printing or accessing a network? This book provides easy-to-follow examples and an extensive look at the tools, commands, and scripts that make Linux run properly.

  • A troubleshooting guide for all Linux users: Focuses on common problems with start-up, printing, login, the network, security, and more
  • Restore Linux when boot, startup, or shutdown fails–and reinstall Linux properly when all troubleshooting fails
  • Use some of the most popular Linux performance tools, including top, sar, vmstat, iostat, and free
  • Handle storage problems and CPU slamming to ensure high Linux performance
  • Solve hardware device problems by deciphering error messages and using the lspci tool
  • Use backup/recover commands and tape libraries to create proper backups
  • Identify and correct remote and network printing problems using spooler commands

Gone are the days of searching online for solutions that are out of date and unreliable. Whether you are a system admin, developer, or user, this book is an invaluable resource for ensuring that Linux runs smoothly, efficiently, and securely.


Product Details

  • ISBN-13: 9780131855151
  • Publisher: Prentice Hall
  • Publication date: 5/5/2006
  • Series: HP Professional Series
  • Pages: 624
  • Product dimensions: 6.98 (w) x 9.18 (h) x 1.08 (d)

Meet the Author

James Kirkland is a Senior Consultant for Racemi. He was previously a Senior Systems Administrator at Hewlett-Packard. He has been working with UNIX variants for more than ten years. James is a Red Hat Certified Engineer, LPIC Level 1 certified, and an HP-UX Certified System Administrator. He has been working with Linux for seven years and HP-UX for eight years. He has been a participant at HP World, Linux World, and numerous internal HP forums.

David Carmichael works for Hewlett-Packard as a Technical Problem Manager in Alpharetta, Georgia. He earned a bachelor's degree in computer science from West Virginia University in 1987 and has been helping customers resolve their IT problems ever since. David has written articles for HP’s IT Resource Center (http://itrc.hp.com) and presented at HP World 2003.

Chris Tinker and Greg Tinker are twin brothers originally from LaFayette, Georgia. Chris began his career in computers while working as a UNIX System Administrator for Lockheed Martin in Marietta, Georgia. Greg began his career at BellSouth in Atlanta, Georgia. Both Chris and Greg joined Hewlett-Packard in 1999. Chris’s primary role at HP is as a Senior Software Business Recovery Specialist and Greg’s primary role is as a Storage Business Recovery Specialist. Both Chris and Greg have participated in HP World, taught several classes in UNIX/Linux and Disk Array technology, and obtained various certifications, including certifications in Advanced Clusters, SAN, and Linux. Chris resides with his wife, Bonnie, and Greg resides with his wife, Kristen, in Alpharetta, Georgia.


Table of Contents

Preface

Chapter 1 System Boot, Startup, and Shutdown Issues

Chapter 2 System Hangs and Panics

Chapter 3 Performance Tools

Chapter 4 Performance

Chapter 5 Adding New Storage via SAN with Reference to PCMCIA and USB

Chapter 6 Disk Partitions and Filesystems

Chapter 7 Device Failure and Replacement

Chapter 8 Linux Processes: Structure, Hangs, and Core Dumps

Chapter 9 Backup/Recovery

Chapter 10 cron and at

Chapter 11 Printing and Printers

Chapter 12 System Security

Chapter 13 Network Problems

Chapter 14 Login Problems

Chapter 15 X Windows Problems

Index


Preface

My good friend, James Kirkland, sent me an instant message one day asking if I wanted to write a Linux troubleshooting book with him. James has been heavily involved in Linux at the HP Response Center for several years. While troubleshooting Linux issues for customers, he realized there was not a good troubleshooting reference available. I remember a meeting discussing Linux troubleshooting. Someone asked what the most valuable Linux troubleshooting tool was. The answer was immediate. Google. If you have ever spent time trying to find a solution for a Linux problem, you know what that engineer was talking about. A wealth of great Linux information can be found on the Internet, but you can't always rely on this strategy. Some of the Linux information is outdated. A lot of it can't be understood without a good foundation of subject knowledge, and some of it is incorrect. We wanted to write this book so the Linux administrator will know how Linux works and how to approach and resolve common issues. This book contains the information we wish we had when we started troubleshooting Linux.

Greg and Chris are identical twins and serious Linux hobbyists. They have been Linux advocates within HP for years. Yes, they both run Linux on their laptops. Chris is a member of the Superdome Server team (http://www.hp.com/products1/servers/scalableservers/superdome/index.html). Greg works for the XP storage team (http://h18006.www1.hp.com/storage/xparrays.html). Their Linux knowledge is wide and deep. They have worked through SAN storage issues and troubleshot process hangs, Linux crashes, performance issues, and everything else for our customers, and they have put their experience into the book.

I am a member of the HP escalations team. I've primarily spent my time resolving HP-UX issues. I've been a Linux hobbyist for a few years, and I've started working Linux escalations, but I'm definitely newer to Linux than the rest of the team. I tried to give the book the perspective of someone who is fairly new to Linux. I tried to remember the questions I had when I first started troubleshooting Linux issues and included them in the book. We sincerely hope our effort is helpful to you.

—Dave Carmichael

Chapter Summaries

These chapter summaries will give you an idea of how the book is organized and a bit of an overview of the content of each chapter.

Chapter 1: System Boot, Startup, and Shutdown Issues

Chapter 1 discusses the subsystems that make up Linux startup. These include the bootloaders GRUB and LILO, the init process, and the rc startup and shutdown scripts. We explain how GRUB and LILO work along with the important features of each. The reader will learn how to boot when there are problems with the bootloader. There are numerous examples. We explain how init works and what part it plays in starting Linux. The rc scripts are explained in detail as well. The reader will learn how to boot to single user mode, emergency mode, and confirm mode. Examples are included of using a recovery CD when Linux won't boot from disk.

Chapter 2: System Hangs and Panics

This chapter explains interruptible and non-interruptible OS hangs, kernel panics, and IA64 hardware machine checks. A Linux hang takes one of two forms. An interruptible hang is when Linux seems frozen but does respond to some events, such as a ping request. Non-interruptible hangs do not respond to any actions. We show how to use the Magic SysReq keystroke to generate a stack trace to troubleshoot an interruptible hang. We explain how to force a panic when Linux is in a non-interruptible hang. An OS panic is a voluntary shutdown of the kernel in response to something unexpected. We discuss how to obtain a panic dump from Linux. The IA64 architecture dump mechanism is also explained.

Chapter 3: Performance Tools

In Chapter 3, we explain how to use some of the most popular Linux performance tools including top, sar, vmstat, iostat, and free. The examples show common syntaxes and options. Every system administrator should be familiar with these commands.
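As a rough illustration of the kind of data these tools report (this sketch is not taken from the book), the Python below parses text in the format of /proc/meminfo, the file that free reads on Linux. The function name parse_meminfo and the sample values are invented for the example.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values.

    Most fields are reported in kB; a few are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


# Hypothetical sample in /proc/meminfo format (values invented).
SAMPLE = """\
MemTotal:        2048000 kB
MemFree:          512000 kB
SwapTotal:       1024000 kB
SwapFree:        1024000 kB
"""

if __name__ == "__main__":
    mem = parse_meminfo(SAMPLE)
    # Memory that is not free; real tools also account for buffers and cache.
    not_free_kb = mem["MemTotal"] - mem["MemFree"]
    print(f"not free: {not_free_kb} kB of {mem['MemTotal']} kB")
```

On a live system the same function could be fed the contents of /proc/meminfo instead of the sample string.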

Chapter 4: Performance

Chapter 4 discusses different approaches to isolating a performance problem. In most performance investigations, storage draws significant attention. The goal of this chapter is to provide a quick understanding of how a storage device should perform and easy ways to get a performance measurement without expensive software. In addition to troubleshooting storage performance, we touch on CPU bottlenecks and ways to find them.

Chapter 5: Adding New Storage via SAN with Reference to PCMCIA and USB

Linux is moving out from under the desk and into the data center. An essential feature of an enterprise computing platform is being able to access storage on the SAN. This chapter provides a detailed walkthrough and examples of installing and configuring Fibre Channel cards. We discuss driver issues, how the device files work, and how to add LUNs.

Chapter 6: Disk Partitions and Filesystems

Master Boot Record (MBR) basics are explained, and examples are shown detailing how bootloader programs such as LILO and GRUB manipulate the MBR. We explain the partition table, and a lot of examples are given so that the reader will understand how the disk is carved up into extended and logical partitions. Many scenarios are provided explaining common disk and filesystem problems and their solutions. After reading this chapter, the reader will understand not only what MBR, LBA, extended partitions, and all the other buzzwords mean, but also how they look on the disk and how to fix problems related to them.
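To give a feel for how a partition table looks on the disk, here is a minimal Python sketch (not from the book) that decodes the four primary entries of a classic 512-byte MBR sector. It assumes the standard layout: 446 bytes of boot code, four 16-byte entries, and the 0x55 0xAA signature. The helper names parse_mbr and build_sample_mbr and the sample partition sizes are made up for illustration, and the sample sector is built in memory so nothing is read from a real disk.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # partition table follows 446 bytes of boot code
ENTRY_SIZE = 16
EXTENDED_TYPES = {0x05, 0x0F, 0x85}  # common extended-partition type codes


def parse_mbr(sector):
    """Decode the four primary partition entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # Layout: status, CHS start (skipped), type, CHS end (skipped),
        # starting LBA, sector count; integers are little-endian.
        status, ptype, lba_start, num_sectors = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        partitions.append({
            "slot": i + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "extended": ptype in EXTENDED_TYPES,
            "lba_start": lba_start,
            "sectors": num_sectors,
        })
    return partitions


def build_sample_mbr():
    """Build a made-up MBR image in memory for demonstration."""
    sector = bytearray(SECTOR_SIZE)
    entries = [
        (0x80, 0x83, 63, 208782),       # bootable Linux partition
        (0x00, 0x05, 208845, 1000000),  # extended partition container
    ]
    for i, (status, ptype, start, count) in enumerate(entries):
        struct.pack_into("<B3xB3xII", sector, TABLE_OFFSET + i * ENTRY_SIZE,
                         status, ptype, start, count)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    for part in parse_mbr(build_sample_mbr()):
        print(part)
```

Logical partitions live in a chain of records inside the extended partition; following that chain is left out of this sketch.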

Chapter 7: Device Failure and Replacement

This chapter explains how to identify problems with hardware devices and how to fix them. We begin with a discussion of supported devices. Whether a device is supported by the Linux distribution is a good thing to know before spending a lot of time trying to get it working. Next we show where to look for indications of hardware problems. The reader will learn how to decipher the hexadecimal error messages from dmesg and syslog. We explain how to use the lspci tool for troubleshooting. When the error is understood, the next goal is to resolve the device problem. We demonstrate techniques for determining what needs to be done to fix device issues, including those with SAN devices.

Chapter 8: Linux Processes: Structure, Hangs, and Core Dumps

Process management is the heart of the Linux kernel. A system administrator should know what happens when a process is created to troubleshoot process issues. This chapter explains process creation and provides a foundation for troubleshooting. Linux is a multithreading kernel. The reader will learn how multithreading works and what heavyweight and lightweight processes are. The reader also will learn how to troubleshoot a process that seems to be hanging and not doing any work. Core dumps are also covered. We show you how to learn which process dumped core and why. This chapter details how cores are created and how to best utilize them to understand the problem.

Chapter 9: Backup/Recovery

Creating good backups is one of the most important tasks a system administrator must perform, if not the most important. This chapter explains the most commonly used backup/recovery commands: tar, cpio, dump/restore, and so on. Tape libraries (autoloaders) are explained along with the commands needed to manipulate them. The reader will learn the uses of different tape device files. There are examples showing how to troubleshoot common issues.

Chapter 10: cron and at

The cron and at commands are familiar to most Linux users. These commands are used to schedule jobs to run at a later time. This chapter explains how the cron/at subsystem works and where to look when jobs don't run. The cron, at, batch, and anacron facilities are explained in detail. The kcron graphical cron interface is discussed. Numerous examples are provided to demonstrate how to resolve the most common problems. The troubleshooting techniques help build good general troubleshooting skills that can be applied to many other Linux problems.
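When a job does not run, one early check is whether the schedule line means what you think it does. The Python sketch below (an illustration, not code from the book) tests a five-field crontab schedule against a given time. It assumes numeric fields only, with support for *, lists, ranges, and /step; month and day names are not handled. The function names field_matches and cron_matches are hypothetical.

```python
from datetime import datetime


def field_matches(field, value, lo, hi):
    """Check one crontab field (supports *, N, A-B, comma lists, and /step)."""
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expr, when):
    """Return True if the five-field schedule expr fires at datetime when."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = when.isoweekday() % 7  # cron counts Sunday as 0
    dom_ok = field_matches(dom, when.day, 1, 31)
    # Sunday may be written as 0 or 7, so test both spellings.
    dow_ok = field_matches(dow, cron_dow, 0, 7) or (
        cron_dow == 0 and field_matches(dow, 7, 0, 7))
    # If both day fields are restricted, cron runs when either one matches.
    if dom.startswith("*") or dow.startswith("*"):
        day_ok = dom_ok and dow_ok
    else:
        day_ok = dom_ok or dow_ok
    return (field_matches(minute, when.minute, 0, 59)
            and field_matches(hour, when.hour, 0, 23)
            and field_matches(month, when.month, 1, 12)
            and day_ok)


if __name__ == "__main__":
    # 8 May 2006 was a Monday, so a "Mondays at 02:30" schedule should fire.
    print(cron_matches("30 2 * * 1", datetime(2006, 5, 8, 2, 30)))
```

The either-day rule in the middle of cron_matches is a frequent source of surprise: a line that restricts both day-of-month and day-of-week fires more often than many people expect.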

Chapter 11: Printing and Printers

This chapter explains the different print spoolers used in Linux systems. The reader will learn how the spooler works. The examples show how to use spooler commands such as lpadmin, lpoptions, lprm, and others to identify problems. The different page description languages such as PCL and PostScript are explained. Examples demonstrate how to fix remote printing and network printing problems.

Chapter 12: System Security

Security is a concern of every system administrator. Is the box safe because it is behind a firewall? What steps should be taken to secure my system? These questions are answered. Host-based and network-based security are explained. Secure Shell protocol (SSH) is covered in detail: why SSH is secure, encryption with SSH, SSH tunnels, and troubleshooting typical SSH problems, with SSH examples provided. The reader will learn system hardening using netfilter and iptables. Netfilter and iptables together make up the standard firewall software for the Linux 2.4 and 2.6 kernels.

Chapter 13: Network Problems

Network issues are a common problem for any system administrator. What should be done when Linux boots and users can't connect? Is the problem with the Linux box or something on the LAN? Has the network interface card failed? We need a systematic way to verify the network hardware and Linux configuration. Chapter 13 provides the information a Linux system administrator needs to troubleshoot network problems. Learn where to look for configuration problems and how to use the commands ethtool, modinfo, mii-tool, and others to diagnose networking problems.

Chapter 14: Login Problems

Chapter 14 explains how the login process works and how to troubleshoot login failures. Password aging is explained. Several examples show the reader how to fix common login problems. The Pluggable Authentication Modules (PAM) subsystem is explained in detail. The examples reinforce the concepts explained and demonstrate how to fix problems encountered with PAM.

Chapter 15: X Windows Problems

GNOME and KDE are client/server applications just like many others that run on Linux, but they can be frustrating to troubleshoot because they are display managers. After reading this chapter, the reader will understand the components of Linux graphical display managers and how to troubleshoot problems. Practical examples are provided to reinforce the concepts, and they can be applied to real-world problems.

© Copyright Pearson Education. All rights reserved.
