Dowsing for Data
Reading Data Mining by Bhavani Thuraisingham is a poignant
experience. Thuraisingham is aware that the technology she expounds
has within it the potential to take away human freedom. Successfully
raising the issue, she fails to address it satisfactorily in an
otherwise masterful and readable summary of her field.
Data Mining is a scholarly work. The author commences from
an epistemological standpoint:
"The actual universe has the truth about all of the
entities in the universe. The perceived universe is the people's
view of the universe. This view is usually determined by someone
or a group of people in authority."
Indeed. If the credit bureau demurs, you'll be rented no
apartment.
"For data modeling purposes, it is the perceived view of
the universe that is of interest. This is because the views of the
users of the database must be correctly reflected."
Clearly the credit bureau's perceived universe is more valid than
yours or mine, because they have paid for the data mining.
"For example, an intelligence agency could determine
abnormal behavior of its employees using this technology."
There are fair indications that they already track the behavior of
citizens using the Internet. Is being spied on by a cron process
more scientific than being tested for witchcraft by being tossed
bound into a river to see if you float?
Is it Safeway or Visa who knows best what to do with the record of
every prescription you ever purchased, or is it the DEA? How many
hits have you made, intentionally or inadvertently, on Web sites
containing pornography? Sites that mention legalizing marijuana?
Sites that offer abortion information? Addresses of gay support groups?
Guns for sale?
These concerns represent what the author calls the "social and
political" aspects that one should "note," in closing Chapter 13 on
"Security and Privacy." That chapter, by the way, is mostly about
maintaining the security and privacy of the data itself, not the
security and privacy of the lives it shadows. "We need the technology
first before we can enforce various policies and procedures,"
Thuraisingham concludes laconically.
Who is mining what inferences from what data? Data Mining
shrugs and turns to the more entertaining topic of deceiving "the
adversary" and making him doubt his data mining tool and its
inferences.
Data Mining is a profound overview of an important domain
of human knowledge, as well as a profound reminder, as if one more
were needed at the close of the twentieth century, that science is by
itself amoral and available to the highest bidder.
Data Mining is not an implementation book; we remain in the
domain of theory with a bibliography of practical works. If you are
looking for the broad contours of the field depicted by a
distinguished expert, Bhavani Thuraisingham, 1997 winner of IEEE's
Technical Achievement Award, has produced a memorable opus.
Dr. Dobb's Electronic Review of Computer Books