SharePoint Server 2010 Enterprise Content Management [NOOK Book]



NOOK Book (eBook)
BN.com price: $28.49
List price: $49.99 (save 43%)
Note: This NOOK Book can be purchased in bulk. Please email us for more information.

Overview

Learn to use SharePoint Server 2010 as a robust platform for ECM

Within the pages of this book, a team of SharePoint Server and Enterprise Content Management (ECM) authorities takes you on a journey that examines the history of ECM, the capabilities that SharePoint Server 2010 offers for ECM, and the high-level ECM pillars that exist within SharePoint Server 2010. Once you have a strong foothold on what ECM is, the authors segue to in-depth coverage of document management and explorations of SharePoint workflow, collaboration, and search concepts as they apply to SharePoint as an ECM platform. Sparing no detail on a wide range of topics, this valuable resource boasts an abundance of helpful code samples and real-world examples to strengthen your learning experience.

SharePoint Server 2010 Enterprise Content Management:

  • Discusses the features of document management from a user perspective, an administrative perspective, and a technical perspective

  • Looks at the SharePoint web content management solution as well as how SharePoint provides support for records management

  • Covers electronic forms management from an ECM angle

  • Provides guidance for architecting a scalable platform

  • Reviews the most common document formats

  • Explores various techniques and protocols for importing content into SharePoint

wrox.com

Programmer Forums

Join our Programmer to Programmer forums to ask and answer programming questions about this book, join discussions on the hottest topics in the industry, and connect with fellow programmers from around the world.

Code Downloads

Take advantage of free code samples from this book, as well as code samples from hundreds of other books, all ready to use.

Read More

Find articles, ebooks, sample chapters, and tables of contents for hundreds of books, and more reference resources on programming topics that matter to you.

Wrox Professional guides are written by working programmers to meet the real-world needs of programmers, developers, and IT professionals. Focused and relevant, they address the issues technology professionals face every day. They provide examples, practical solutions, and expert education in new technologies, all designed to help programmers do a better job.


Product Details

  • ISBN-13: 9781118167311
  • Publisher: Wiley
  • Publication date: 8/24/2011
  • Sold by: Barnes & Noble
  • Format: eBook
  • Edition number: 1
  • Pages: 480
  • File size: 29 MB
  • Note: This product may take a few minutes to download.

Meet the Author

Todd Kitta is currently employed at KnowledgeLake, Inc., a SharePoint ISV specializing in document imaging and Enterprise Content Management on the Microsoft SharePoint platform.

Brett Grego is the Director of Engineering at KnowledgeLake, Inc.

Chris Caplinger is the CTO as well as one of the founders of KnowledgeLake, Inc.

Russ Houberg is a SharePoint Microsoft Certified Master and a senior architect at KnowledgeLake, Inc.


Read an Excerpt

SharePoint Server 2010 Enterprise Content Management


By Todd Kitta, Brett Grego, Chris Caplinger, and Russ Houberg

John Wiley & Sons

Copyright © 2011 John Wiley & Sons, Ltd
All rights reserved.

ISBN: 978-0-470-58465-1


Chapter One

What Is Enterprise Content Management?

WHAT'S IN THIS CHAPTER?

* Defining ECM as used by this book

* Gaining a historical perspective of ECM

* Defining the components of an ECM system

Considering that this is a book both by and for architects and developers, devoting an entire chapter to talking about the enterprise content management (ECM) industry and trying to define it, rather than just jumping into the bits and bytes that you probably bought the book for, might seem strange. However, by introducing ECM as part of an industry, instead of describing how the SharePoint world perceives it, we hope to provide a perspective that wouldn't otherwise be possible if you make your living inside the SharePoint ecosystem.

ECM, within or outside of the SharePoint world, seems to be a much-abused abbreviation used to describe a variety of different technologies. Of course, people often adopt new or existing terms, applying their own twist to the original meaning, and this is certainly the case with ECM. The difficult part is determining which meaning is actually correct. Sometimes even the words representing the initials are changed. For example, in the halls of our own company, sometimes "electronic" is used instead of "enterprise." In other cases, ECM is confused with specific technologies that are part of it, such as DMS (Document Management System), IMS (Image Management System) or WCM (Web Content Management).

Clearly, ECM means a lot of different things to a variety of people. There is no doubt that some readers of this book will think something is missing from the definition, while other readers will find something included that does not fall into their own definition. That being said, this chapter introduces ECM not necessarily from a SharePoint perspective, but from a historical perspective; then it provides an overview of the components of an ECM system. You can skip this information, but we believe it is important to clarify the problems we are trying to solve, rather than just write code based on our own assumptions.

INTRODUCTION TO ECM

The "content" aspect of enterprise content management can refer to all kinds of sources, including electronic documents, scanned images, e-mail, and web pages.

This book uses the definition of ECM from the Association for Information and Image Management (AIIM) International, which can be found on their website at www.aiim.org:

Enterprise Content Management (ECM) is the strategies, methods, and tools used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and strategies allow the management of an organization's unstructured information, wherever that information exists.

As this definition states, ECM is not really a noun. That is, it's not something as simple as an e-mail system or a device like a scanner, but rather an entire industry for capturing and managing just about any type of content. The key to the definition is that this content is related to organizational processes, which discounts information that is simply created but never used.

Moreover, ECM is meaningless without the tools that accompany it. You might say that the tools that solve your content problem also define it. This idea is explored in the next section, and hopefully clarified by a short history of a few of the technologies involved.

A HISTORICAL PERSPECTIVE

Although the term ECM is relatively new, many of the components that make up an ECM system started appearing in the 1970s. The world of information systems was vastly different 30 to 40 years ago. The Internet as we know it did not exist, the cost to store data was astronomical compared to today, server processing power was a mere fraction of what it is today, and desktop computers didn't even exist.

The history of ECM can be traced back to several technologies that first stored and managed electronic content: document imaging, electronic document management, computer output to laser disc (COLD), and of course workflow, which formed the business processes.

Document Imaging

As evidenced by the first systems to take the management and processing of documents seriously, paper was one of the first drivers. These systems were often referred to as electronic document management or document imaging systems. By scanning paper and storing it as electronic documents, organizations found a quick return on investment in several ways:

* It reduced the square footage needed to store paper.

* It resulted in faster execution of paper-based processes by electronic routing.

* It eliminated the time it took to reproduce lost documents.

* It reduced overhead because paper documents could be retrieved electronically.

In addition to a reduction in manpower, there were other benefits to storing paper electronically, namely security and risk benefits, which preceded regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and Sarbanes-Oxley by more than a decade. Some of these included the following:

* Password protection of documents

* Enterprise security restraint brought about by secure networks

* Management of records needed for legal holds

* Management of the document life cycle, such as retention periods

* Audit information about the document life cycle and requests about the document

The first document imaging systems for commercial consumption became available in the early 1980s, and they quickly started to replace the previous technology for removing paper from organizations, which was microfiche. Billions of documents were stored on microfiche, but indexes and location data were often stored in databases. Conversions of these systems to document imaging are surely still being handled today.

The ability to scan existing paper documents in order to create electronic documents, as discussed in the next section, led to the vision of a "paperless office," a commonly used phrase by the end of the century. Of course, this lofty and often pursued goal of a paperless office has yet to materialize, and paper is still the original driver behind many business processes. As shown in Figure 1-1, focusing on paper is a good starting point to quickly begin realizing the benefits of an ECM system.

Electronic Documents

The invention of computer-based word processors (in the 1970s) created the need for a way to store and quickly retrieve these documents. Electronic documents share similarities with document imaging systems, yet they are unique in that they are typically dynamic; that is, they often require ongoing modification, whereas scanning paper was typically performed for archiving purposes.

The first electronic documents were created through word processing software, driven in the late 1970s by WordStar and WordPerfect. Although the former has been abandoned, WordPerfect still exists today and is part of an office suite from Corel.

Soon after personal computers and electronic word processors hit the market, electronic spreadsheets became available, beginning with VisiCalc, followed by Lotus 1-2-3 and eventually Microsoft Excel. Spreadsheet documents are now as commonplace as word processing documents.

Today, electronic documents exist in countless types and formats, ranging from simple ASCII text files to complex binary structures.

COLD/Enterprise Report Management

The widespread use of computers, beginning with the large mainframes, resulted in an unprecedented use of paper. Early computers all over the world started producing reports, typically on what is known as green bar paper. As the need for information from both mainframes and minicomputers grew, so did the need for computer-generated reports. Necessary at first because structured methods for viewing data electronically did not exist, this excessive use of paper continued to plague organizations into the 1990s and even into this millennium.

Out of this problem grew a solution coined computer output to laser disc (COLD). Instead of generating paper, these reports could be handled in a type of electronic content management system, typically storing the ASCII data and rendering it onto monitors. These systems enabled not only search and retrieval of the reports, but the addition of annotations, and of course printing of the documents when using a monitor is not adequate.

The term COLD was eventually replaced by enterprise report management when magnetic storage replaced the early optical storage systems.

Business Process Management/Workflow

Storing content electronically was a great step forward, but moving content to a digital medium quickly raised a new need: putting the content in front of the right user at the right time.

The first business process management (BPM) systems, launched in the mid-1980s and called workflow systems, were created by the same companies that brought document imaging systems to market. These early systems were far less complex than the BPM systems used today, however, as they primarily enabled content to be put into queues to be processed by the same workers who processed the paper.

It was almost 10 years later when the first graphical components became available for creating complex workflow maps. This was the beginning of BPM as we know it today, which enables organizations to create, store, and modify business processes.

ECM COMPONENTS

In order to understand ECM, it is necessary to understand the common components that make up such a system. The following sections provide a brief overview of these components, which, like the definition for ECM provided earlier, have been defined by AIIM.

Capture

Capture is the process of gathering the data, regardless of the source, including classification, indexing (sometimes called tagging), and rendition. These tasks are required before storage is possible, in order to understand what type of content is being managed, which keywords will be used to search for the content, and to ensure that the content is in a form that can be easily retrieved and viewed later.

Paper

Paper is still the primary driver of ECM. The reason is simple: Because of the volume of paper that most companies need to handle, efficiently managing that paper can provide the greatest return on investment. Figure 1-2 gives you some idea of the cost of paper in an enterprise.

Paper capture is primarily done by using document scanners specifically built for the purpose. These scanners can capture both small and large batches of paper. Paper documents are typically divided into three categories: structured, semi-structured, and unstructured.

Structured documents typically represent forms such as tax documents, applications, or other preexisting forms. Because they are always the same, these documents are usually the easiest to automatically extract information from. Technology such as Zonal OCR can be easily applied to structured forms because the key data always exists in the same place.
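The idea behind zonal extraction on a structured form can be sketched in a few lines of Python. This is only an illustration, not how any particular OCR product works. It assumes a recognition step has already produced words with page coordinates, and the `Word`, `Zone`, and `extract_fields` names, the zone coordinates, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word and the top-left corner of its bounding box."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Zone:
    """A fixed rectangle on the form where one field always appears."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


def extract_fields(words: list[Word], zones: dict[str, Zone]) -> dict[str, str]:
    """Collect the words that fall inside each zone, in reading order."""
    fields = {}
    for name, zone in zones.items():
        inside = sorted((w for w in words if zone.contains(w)), key=lambda w: (w.y, w.x))
        fields[name] = " ".join(w.text for w in inside)
    return fields


# Hypothetical zones for a made-up application form (units are arbitrary).
FORM_ZONES = {
    "applicant_name": Zone(left=100, top=50, right=400, bottom=80),
    "tax_year": Zone(left=450, top=50, right=550, bottom=80),
}

# Hypothetical recognition output for one scanned page.
PAGE_WORDS = [
    Word("Jane", 110, 60),
    Word("Doe", 160, 60),
    Word("2010", 460, 60),
    Word("Signature", 100, 700),  # outside every zone, so it is ignored
]

fields = extract_fields(PAGE_WORDS, FORM_ZONES)
```

Because the zones are fixed, the same table of rectangles works for every page of that form type, which is why this approach breaks down as soon as the layout varies from one sender to the next.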

Semi-structured documents are similar to structured documents, but they are different enough that structured zones can no longer be used to extract data. A common type of semi-structured document is the invoice. Most invoices are similar enough to be recognized as such, but each company designs its invoice with enough nuances to distinguish it from others, so these documents require either manual manipulation of the data or more intelligent automation.

Unstructured content represents the majority of the information in the average corporate enterprise. Almost all human correspondence is a good example of unstructured content. Although this content ends up being stored as the same content type in each company, the individual documents have very little in common on a per-page basis. Manual indexing is typically required for this type of data.

Developing applications to handle all the different types of paper that may come into an enterprise is a daunting task. Fortunately, toolkits are available to drive most document scanners on the market today, and these are normally compliant with either (or both) the TWAIN and ISIS driver standards. Although the physical process of scanning a piece of paper is simple, building a good process for either manually entering information or automatically extracting it is difficult, and probably not cost effective for custom applications.

Office Documents

Because this book is about ECM and (Microsoft) SharePoint, it focuses on the most common type of electronic documents that need to be managed: (Microsoft) Office documents. These documents are created using word processing software, spreadsheet software, presentation software, and so on. These documents are typically pre-classified on creation, as they frequently start from a template; therefore, extracting data for searching is often overlooked. Although each word from the document can be added to search indexers, it makes more sense to use specific keywords to identify the document. Sometimes pre-identified form fields are used, but often the most important data is keyed by hand before sending the document to storage.

E-mail

Capturing e-mails into an ECM system is becoming an increasingly common scenario. E-mail is often used to drive a business process, as it is becoming an acceptable form of correspondence in most organizations. Although some data can be indexed automatically, such as the sender, receiver, and subject, as shown in Figure 1-3, it can be difficult to extract the useful information contained in the body of the message, and manual intervention is usually required.
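As a rough illustration of the automatic part of e-mail indexing, the following Python sketch pulls the sender, recipients, and subject out of a raw message using only the standard library `email` package. The `index_email` function name, the index field names, and the sample message are assumptions made for this example; the message body is deliberately left untouched, since that is the part that usually needs manual review.

```python
from email import message_from_string
from email.utils import getaddresses

# A made-up message used only for this illustration.
RAW_MESSAGE = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>, carol@example.com
Subject: Invoice 1234 attached

Please see the attached invoice.
"""


def index_email(raw: str) -> dict:
    """Return the header fields that can be indexed without human help."""
    msg = message_from_string(raw)
    senders = getaddresses(msg.get_all("From", []))
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {
        "sender": [address for _, address in senders],
        "recipients": [address for _, address in recipients],
        "subject": msg.get("Subject", ""),
    }


index_fields = index_email(RAW_MESSAGE)
```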

E-mail attachments are often more important than the e-mail message itself. Although Microsoft Outlook and other e-mail clients are improving ECM integration, extracting the attachment and exporting it to an ECM system usually requires specialized software in order to properly tag the content with searchable data.

Reports

As mentioned earlier, technology originally known as computer output to laser disc (COLD) and later as enterprise report management enables computer reports to be parsed into electronic files. Classifying and indexing these documents is typically automatic because they are in a form that is very structured.

Like the handling of paper, building a system for handling enterprise reports is most likely not cost effective when you compare the needed functionality versus the difficulty of obtaining it. Consider being able to read EDI or other electronic streams and extract the necessary data from them. Also, although data storage may not be an issue, you must consider how it will be displayed to users in a readable format.

Electronic Forms

It was initially believed that the goal of the paperless office would be achieved with the help of electronic forms. After all, the form templates provided in many applications, such as InfoPath, enable users to fill out preexisting fields and submit them directly to a content management system. With the type of content already known and the data being put into electronic form as it was gathered, it stood to reason that the paper forms could gradually disappear. However, human habits die hard. It may take another generation of computer users, who have been raised from birth with computers and who use them for everything from social networking to bill-paying, to realize the truly paperless office.

Other Sources

Although the most common types of capture have been identified here, there are many other possible data sources and data types. Multimedia, XML, and EDI are other well-known data formats that can arrive from many different sources. Indeed, just about any type of data can be consumed in an ECM system.

Store and Preserve

The store and preserve components of an ECM system are very similar; storage traditionally pertains to the temporary location of content, whereas preservation refers to long-term storage. In the past these were separated because online storage was costly. Content was usually stored in the temporary location only during its active life cycle, when it was frequently accessed as part of the business process. Once the active life cycle was complete, content would move to long-term storage, known as offline storage or nearline storage, which was much less expensive. The term "offline" reflects the fact that the content wasn't accessible without human intervention; the term "nearline" typically refers to optical discs that were brought online automatically, such as in the case of a jukebox. The following sections describe both the software and hardware components of storage and preservation of content.
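The life cycle described above can be expressed as a small placement rule. The Python sketch below is an assumption-laden illustration rather than a description of any real product: the `Tier` and `choose_tier` names are hypothetical, and the one-year window that separates nearline from offline storage is an invented threshold, since the text does not specify one.

```python
from datetime import date, timedelta
from enum import Enum


class Tier(Enum):
    ONLINE = "online"      # temporary location used during the active life cycle
    NEARLINE = "nearline"  # long-term media brought online automatically (e.g. a jukebox)
    OFFLINE = "offline"    # long-term media that needs human intervention to access


def choose_tier(
    in_active_process: bool,
    last_accessed: date,
    today: date,
    nearline_window: timedelta = timedelta(days=365),  # hypothetical threshold
) -> Tier:
    """Pick a storage tier for one piece of content."""
    if in_active_process:
        return Tier.ONLINE
    if today - last_accessed <= nearline_window:
        return Tier.NEARLINE
    return Tier.OFFLINE


today = date(2011, 8, 24)
active = choose_tier(True, date(2011, 8, 1), today)
recent = choose_tier(False, date(2011, 1, 15), today)
stale = choose_tier(False, date(2005, 3, 2), today)
```

In practice the decision would depend on retention policy and storage cost rather than a single date comparison, but the shape of the rule is the same: content leaves the expensive tier once its active life cycle is over.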

(Continues...)



Excerpted from SharePoint Server 2010 Enterprise Content Management by Todd Kitta, Brett Grego, Chris Caplinger, and Russ Houberg. Copyright © 2011 by John Wiley & Sons, Ltd. Excerpted by permission of John Wiley & Sons. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.


Table of Contents

INTRODUCTION

PART I: INTRODUCTION TO ENTERPRISE CONTENT MANAGEMENT

CHAPTER 1: WHAT IS ENTERPRISE CONTENT MANAGEMENT?

Introduction to ECM

A Historical Perspective

Document Imaging

Electronic Documents

COLD/Enterprise Report Management

Business Process Management/Workflow

ECM Components

Capture

Paper

Office Documents

E-mail

Reports

Electronic Forms

Other Sources

Store and Preserve

Software

Hardware and Media Technologies

Cloud

Management Components

Document Management

Web Content Management

Business Process Management and Workflow

Records Management

Collaboration

Delivery

Search

Viewing

Transformation

Security

Summary

CHAPTER 2: THE SHAREPOINT 2010 PLATFORM

A Brief History of SharePoint

SharePoint 2010

Capability Categories

Sites

Composites

Insights

Communities

Content

Search

SharePoint Concepts

Architecture

Development Concepts

ECM in SharePoint 2010

Managed Metadata

Ratings

The Content Type Hub

Search

Workflow

Document Sets

Document IDs

Content Organizer

Records Management

Digital Asset Management

Web Content Management

Summary

PART II: PILLARS OF SHAREPOINT ECM

CHAPTER 3: DOCUMENT MANAGEMENT

What Is Document Management?

Microsoft SharePoint As a Document Management System

Document Taxonomy

Document Libraries

The Document Library Programming Model

Columns

The Column Programming Model

Content Types

The Content Type Programming Model

Managed Metadata

Administering Managed Metadata

Creating a Global Term Set

Using a Term Set in a Column

The Managed Metadata Programming Model

The Managed Metadata Service

Content Type Syndication

The Content Type Syndication Programming Model

Management of Managed Metadata Service Applications

Location-Based Metadata Defaults

Configuring Location-Based Metadata Defaults

The Location-Based Metadata Defaults Programming Model

Metadata Navigation

Configuring Metadata Navigation

Using Metadata Navigation

The Managed Metadata Navigation Programming Model

The Document ID Service

The Document ID Programming Model

Document Sets

Implementing Document Sets

Creating Custom Document Sets

Using Document Sets

The Document Set Programming Model

Document Control

Security

Managing Users and Groups

The Security Programming Model

Check-In/Check-Out

How to Check Out a Document

Programmatically Checking Out a Document

Versioning

How to Configure Versioning

Version History

Programmatically Interacting with Version History

The Audit Trail

The Content Organizer

Summary

CHAPTER 4: WORKFLOW

Workflow and ECM

Windows Workflow Foundation

WF Concepts

Activities

Workflow Modes

Persistence

The Role of Workflow in SharePoint

Workflow Scopes

Item

Site

Workflow Phases

Association

Initiation

Execution

Authoring and Workflow Types

Out-of-the-Box Workflows

The Approval Workflow

Declarative Workflows

Visio

SharePoint Designer Workflows

Visual Studio Workflows

Improvements

Creating a Workflow in Visual Studio: An Exercise

InfoPath

Out-of-the-Box Workflows

SharePoint Designer Workflows

Visual Studio

Pluggable Workflow Services

Why You Need Workflow Services

Authoring Custom Workflow Services

Workflow Event Receivers

Summary

CHAPTER 5: COLLABORATION

ECM and Collaboration

SharePoint Is Collaboration

Social Tagging

Tags

How to Create Tags

Tag Cloud

Notes

How to Create Notes

Ratings

Enabling Ratings for a Document Library or List

How to Rate an Item

Bookmarklets

Registering the Tags and Notes Bookmarklet

Creating Tags and Notes Using Bookmarklets

Privacy and Security Concerns

Tagging Programming Model

Working with Tags Programmatically

Working with Notes Programmatically

Working with Ratings Programmatically

My Sites

My Profile

My Content

My Newsfeed

My Sites Architecture

Configuring My Sites

Configuring My Site Settings in the User Profile Service Application

Enabling the Activity Feed Timer Job

User Profiles

User Profile Policies

User Profile Programming Model

Working with a User Profile Programmatically

User Profile Service Application

People

Organizations

My Site Settings

Synchronization

Enterprise Wikis

Blogs

Microsoft Office Integration

SharePoint Workspace

Outlook Integration

Summary

CHAPTER 6: SEARCH

Introduction

Retrieval: The Key to User Adoption

The Corpus Profile

What Types of Documents Will Be Crawled?

Is an IFilter Available for Full-text Crawling All Document Types?

How Many of Each Document Type Will Be Crawled?

What Is the Average File Size By Document Type?

How Often Are Existing Documents Changed?

How Much New Content Will Be Added During a Specific Period of Time?

Impact of the Corpus Profile

Search Solutions

SharePoint Server 2010 Enterprise Search

Topology Components

Configuration Components

The Search Center

Calling the Search API

FAST Search for SharePoint 2010

Functional Overview

Index and Query Processing Path

Search Architectures for SharePoint ECM

Sample Architectures

3-Million-Item Corpus

10-Million-Item Corpus

40-Million-Item Corpus

100-Million-Item Corpus

500 Million Documents

The Impact of Virtualization

Tuning Search Performance

Health Monitoring

Performance Monitoring

Improving Crawl Performance

Improving Query Performance

Summary

CHAPTER 7: WEB CONTENT MANAGEMENT

WCM Overview

Improvements in 2010

Authoring

AJAX

Accessibility

Markup Standards

Content Query Web Part

Cross-browser Support

Rich Media

Metadata

Spectrum of WCM in 2010

The SharePoint Server Publishing Infrastructure

Templates

Features

Security

Approve Permission Level

Manage Hierarchy Permission Level

Restricted Read Permission Level

Groups

Content Types

“Content” Content Types

Infrastructural Content Types

Site Content

The Anatomy of a Page

Master Pages

Page Layouts

An Exercise with Taxonomy and Layouts

Metadata

Content Query Web Part

Web Part Options

Query Options

Presentation

The Content Authoring Process

Authoring Web Content

Using the Content Organizer

Content Deployment

Workflow

Enterprise Wikis

Other Major Considerations

Branding

Navigation and Search

Targeting Global Users

Reporting and Analytics

Summary

CHAPTER 8: RECORDS MANAGEMENT

What Is Records Management?

Why Records Management Is Important

Microsoft SharePoint as a Records Management System

Records Management Planning

Identifying Roles

Analyzing Content

Developing a File Plan

Designing a Solution

Compliance and SharePoint

Managing Records

Recordization

In-Place Records Management

Records Center

Content Organizer

Workflow in Recordization

Programming Model for Recordization

Information Management Policy

Configuring Information Management Policy

Exporting and Importing Policy Settings

Programming Model for Information Management Policy

Retention

Creating Retention Schedules

Programmatically Creating Retention Schedules

Auditing

Configuring Auditing

Reporting

Audit Reports

File Plan Report

eDiscovery

Summary

CHAPTER 9: DIGITAL ASSET MANAGEMENT

SharePoint Server 2010 Digital Asset Management Components

The Asset Library

Digital Asset Columns

Digital Asset Content Types

Media and Image Web Parts

Media Web Part and Field Control

Picture Library Slideshow Web Part

Image Viewer Web Part and Field Control

Content Query Web Part

Digital Asset Management Solution Scenarios

Marketing and Brand Management

Media Development Project

Online Training Center

Audio or Video Podcasting

Media Resource Library

Taxonomy Considerations

Storage Considerations

Managing Content Database Size

Remote BLOB Storage

Maximum Upload Size

Performance Optimization

BLOB Caching

Bit Rate Throttling

Summary

CHAPTER 10: DOCUMENT IMAGING

What Is Document Imaging?

SharePoint as a Document Imaging Platform

Setting Up the Scenario

Solution Data Flow Diagram

Model-View-ViewModel Primer

Creating a Simple WPF Capture Application

Architecture and Design

Implementation

Building the MVVM Infrastructure

Building the Target Dialog

Building the Main Window

Deployment

Creating a Simple Silverlight Viewer Web Part

Architecture and Design

Implementation

Building the Image Loader

Building the Imaging Web Service

Building the Imaging HTTP Handler

Building the Viewer Web Part

Making the Application Accessible from JavaScript

Deployment

Deploying the Imaging Services

Deploying the Viewer Application as a Web Part

Setting Up the SharePoint Infrastructure

Configuring SharePoint Search

Creating the SharePoint Content Type

Creating the SharePoint Document Library

Creating the SharePoint Web Part Page

Setting Up the SharePoint Web Part Page

Customizing the Advanced Search Box Web Part

Customizing the Search Core Results Web Part

The Solution from End to End

Summary

CHAPTER 11: ELECTRONIC FORMS WITH INFOPATH

Electronic Forms Overview

Is It a Form or an Application?

InfoPath Overview

What’s New in 2010

More InfoPath Fundamentals

Forms Services

Deploying Forms

Templates and Form Data

Rules

External Data

Custom Code

Publishing

Determining a Forms Strategy

Creating a Custom Form: An Exercise

Form Data and Layout

Form Rules

Form Submission

Publishing the Form

Summary

CHAPTER 12: SCALABLE ECM ARCHITECTURE 381

Storage Architecture, the Key to Performance 381

Performance Pitfalls 382

Too Few Disks in the Array 382

Shared SAN vs. DAS vs. NAS 383

Content Storage Size Factors 384

Database Storage and Capacity Planning 385

SQL Server Supporting Concepts 386

TempDB 390

Log Files 392

Crawl Databases 393

Content Databases 395

Property Databases 396

Service Application Databases 397

Management Databases 400

Prioritizing Disk I/O 400

Index Partition Storage 401

Storage Tuning and Optimization 401

Storage Performance Monitoring 401

Database Data File Management 402

Remote BLOB Storage 403

When to Implement an RBS Solution 405

RBS Provider Options 406

Backup and Restore Considerations 407

SQL Server Licensing Considerations 407

SharePoint 2010 Scalable Topology Design 408

Knowing the Users, the Corpus, and the Processes 408

Farm Size Definitions 409

The Case for Additional Web Servers 412

The Case for Additional Application Servers 412

The Case for Additional SQL Servers 412

Scalable Taxonomy Factors 413

Content Organization and Scalable Taxonomy 414

An Exercise in Scalable Taxonomy Design 415

Content Database Size Supported Limits 416

Performance and Resource Throttling 417

Summary 418

PART III: SHAREPOINT ECM SUPPORT CONCEPTS

CHAPTER 13: ECM FILE FORMATS 421

It’s Alive — Your Document, That Is 422

Microsoft Office Formats 422

Microsoft Office Binary 422

Office Open XML 423

Viewing and Editing Microsoft Office Formats with Office Web Apps 425

Word Automation Services 428

Open Document Format 437

Archive Formats 437

TIFF 438

OCR and iFilters 438

Markup 442

Development 442

PDF 442

OCR and iFilters 442

Markup 443

Development 443

Viewing and Editing 443

Living Document Conversion 444

PDF/A 444

Standardization 445

OCR and iFilters 445

Creating, Viewing, and Editing 446

XPS (Open XML Paper Specification) 446

OCR and iFilters 447

Markup 449

Development 449

Summary 450

CHAPTER 14: THE SHAREPOINT ECM ECOSYSTEM 451

The Microsoft Partner Ecosystem 451

Becoming a Partner 452

ISV/Software Competency 452

The SharePoint Ecosystem 453

Technical Community 453

ISV Solutions 454

ABBYY 455

AvePoint 458

GimmalSoft 460

KnowledgeLake 462

Nintex 465

Summary 467

CHAPTER 15: GUIDANCE FOR SUCCESSFUL ECM PROJECTS 469

Migrating to SharePoint 2010 470

Identifying Content for Migration 470

Extracting Content from the Source System 470

File Shares 470

Internally Developed Document Management Solutions 471

Other Legacy Document Management Solutions 472

Preparing Content for Importing 472

Setting the Content Type 472

Metadata Merge 472

Controlling the Import Process 473

General Metadata Cleanup 473

Importing Content into SharePoint 473

Protocols for Importing Content 474

Web Services 474

SharePoint Server Object Model 477

FrontPage Remote Procedure Calls (FPRPC) 479

Protocols for Updating SharePoint Content 480

SharePoint Server Object Model 480

SharePoint Client Object Model 483

Mapping Legacy ECM Features to SharePoint Solutions 486

Document Versions 487

Metadata-based Security 487

Document Protection and Rights Management 488

Annotations and Redaction 488

Search 488

Scanning and Indexing 488

Records Retention and Disposition 489

Workflow 489

Avoiding the Pitfalls 489

Capacity Planning 489

Illegal Characters in Metadata 489

Missing Required Data 490

Content Database Transaction Log Growth 490

Managing Upload File Size Restrictions 490

Upgrading a SharePoint ECM Farm to SharePoint 2010 491

Know Your Farm 491

SharePoint Portal Server 2003 and WSS v2.0 491

Microsoft Office SharePoint Server 2007 and WSS v3.0 492

Imaging or Archive-Only Farm with No Customization 492

Collaboration Farm with Customizations 492

Collaboration Farm with Large Imaging or Archive Site Collections 492

Summary 493

INDEX 495
