Analytics, September/October 2013



HTTP://WWW.ANALYTICS-MAGAZINE.ORG
ALSO INSIDE:
SEPTEMBER/ OCTOBER 2013 DRIVING BETTER BUSINESS DECISIONS
Information Asymmetry
Can decision science reduce market inefficiency?
Analytics Power Player
Interview with PAW founder & author Eric Siegel
‘Certify’ Your Career
CAP® designees comment on certification program
AWASH IN BIG DATA
HOW TO FIND SIGNIFICANT VALUE IN A SEA OF DATA
BROUGHT TO YOU BY:
Executive Edge
MarketShare CEO
Wes Nichols on using
marketing analytics math
WWW.INFORMS.ORG 2 | ANALYTICS-MAGAZINE.ORG
Key skills for analytics pros
INSIDE STORY
Patrick Noonan earned an MBA from
the Yale School of Management, but it
wasn’t until he went to work for manage-
ment consulting firm McKinsey & Com-
pany that he really began to learn and
appreciate the “essential skills” for ana-
lytics professionals. Thirty years later,
Noonan, now a professor of Practice of
Decision & Information Analysis at the
Goizueta Business School at Emory Uni-
versity, has packed all of the “best prac-
tices” of consulting he has garnered over
the last three decades into one of the first
courses offered by the Institute for Op-
erations Research and the Management
Sciences’ (INFORMS) continuing edu-
cation program. The course, “Essential
Skills for Analytics Professionals,” will be
offered Sept. 26-27 in Redwood City, Ca-
lif., and Nov. 7-8 in Washington, D.C.
The skills Noonan is talking about
aren’t taught at Yale or most other busi-
ness or engineering schools, yet they
are critical in real-world problem-solv-
ing. According to Noonan, the “essen-
tial” analytics consulting skills include:
1) defining the client’s problem properly,
2) problem structuring and work planning,
3) managing a project or team, and 4) making the case for change
and implementation of the analysis and
recommendations through persuasive
communication.
After a five-year stint at McKinsey,
Noonan returned to academia, earned a
Ph.D. in Decision Science from Harvard
University and joined the business school
faculty at Emory. He continued to consult
and teach executive education business
courses … and never forgot his McKinsey
experience and the “essential/professional”
skills he first learned there.
Recognizing a blind spot in his MBA students’ education, he created a
“professional skills” workshop that started as an extracurricular
activity but has since become a required course for Emory MBAs. In it,
he brings in consultants to share their ideas and “best practices” with
MBA students of all stripes.
Noonan, who has taught more than 5,000
MBAs, describes the INFORMS continuing
education course on essential skills as an
expanded version of the course he teaches
at Emory, tailored for analytics professionals. For more information on
the INFORMS Continuing Education Program and the “Essential Skills for
Analytics Professionals” course, see page 18. ❙
– PETER HORNER, EDITOR
peter.horner@mail.informs.org
CONTENTS
FEATURES
32 TURNING BIG DATA INTO INFORMATION
By Paul Kent, Radhika Kulkarni and Udo Sglavo
High-performance analytics helps unlock significant, proprietary
value in big data.
38 THE INFORMATION ASYMMETRY PROBLEM
By Krishna Rupanagunta, Ajay Parasuraman and
Sourav Banerjee
Bridging the information gap: How decision science can reduce
market inefficiency.
46 PREDICTIVE ANALYTICS POWER PLAYER
By Peter Horner
PAW founder Eric Siegel discusses the power of predictive
analytics and his new book.
54 MESSY ANALYTICS? IT’S OK. WE’RE HOUSEWIVES!
By Gary Cokins
Information systems and technology are messy, so applying
analytics is an arduous task.
60 HOW WILL CERTIFICATION IMPACT YOUR CAREER?
Compiled by Gary Bennett and Peter Horner
Candid comments from CAP® designees underscore successful
launch of INFORMS program.
68 LOST IN TRANSLATION II
By Christopher Broxe and Fiona McNeill
Part two of a two-part series on best practices for analyzing
multi-lingual text.
REGISTER FOR A FREE SUBSCRIPTION:
http://analytics.informs.org
INFORMS BOARD OF DIRECTORS
President Anne G. Robinson, Verizon Wireless
President-Elect Stephen M. Robinson, University of Wisconsin-Madison
Past President Terry Harrison, Penn State University
Secretary Brian Denton, University of Michigan
Treasurer Nicholas G. Hall, Ohio State University
Vice President-Meetings William “Bill” Klimack, Chevron
Vice President-Publications Eric Johnson, Dartmouth College
Vice President-Sections and Societies Paul Messinger, University of Alberta
Vice President-Information Technology Bjarni Kristjansson, Maximal Software
Vice President-Practice Activities Jack Levis, UPS
Vice President-International Activities Jionghua “Judy” Jin, Univ. of Michigan
Vice President-Membership and Professional Recognition Ozlem Ergun, Georgia Tech
Vice President-Education Joel Sokol, Georgia Tech
Vice President-Marketing, Communications and Outreach E. Andrew “Andy” Boyd, University of Houston
Vice President-Chapters/Fora Olga Raskina, Con-way Freight
INFORMS OFFICES
www.informs.org • Tel: 1-800-4INFORMS

Executive Director Melissa Moore
Meetings Director Teresa V. Cryan
Marketing Director Gary Bennett
Communications Director Barry List

Headquarters INFORMS (Maryland)
5521 Research Park Drive, Suite 200
Catonsville, MD 21228
Tel.: 443.757.3500
E-mail: [email protected]
ANALYTICS EDITORIAL AND ADVERTISING
Lionheart Publishing Inc., 506 Roswell Street, Suite 220, Marietta, GA 30060 USA
Tel.: 770.431.0867 • Fax: 770.432.6969
President & Advertising Sales John Llewellyn
[email protected]
Tel.: 770.431.0867, ext.209
Editor Peter R. Horner
[email protected]
Tel.: 770.587.3172
Art Director Lindsay Sport
[email protected]
Tel.: 770.431.0867, ext.223
Advertising Sales Sharon Baker
[email protected]
Tel.: 813.852.9942
Analytics (ISSN 1938-1697) is published six times a year by the
Institute for Operations Research and the Management Sciences
(INFORMS), the largest membership society in the world dedicated
to the analytics profession. For a free subscription, register at
http://analytics.informs.org. Address other correspondence to
the editor, Peter Horner, [email protected]. The
opinions expressed in Analytics are those of the authors, and
do not necessarily reflect the opinions of INFORMS, its officers,
Lionheart Publishing Inc. or the editorial staff of Analytics.
Analytics copyright ©2013 by the Institute for Operations
Research and the Management Sciences. All rights reserved.
DEPARTMENTS
2 Inside Story
8 Executive Edge
14 Analyze This!
18 INFORMS Initiative
22 Forum
26 Analytics & Healthcare
78 WSC 2013 Preview
80 Five-Minute Analyst
84 Thinking Analytically
EXECUTIVE EDGE
Good quant, bad quant
How to tell accurate analytics methods from ‘quant quackery.’
BY WES NICHOLS
Recently, a long-time MarketShare client joined a new Fortune 50 company as its CMO. When he inquired how the “marketing mix” models it was using were factoring in digital data, the answer shocked him: “Digital is out of scope.” Dumbfounded, he probed further and found that the model builders were unable to include online/offline effects, so they simply left digital out. But that seems tenuous considering the company spends some 30 percent of its marketing budget on digital, and its products involve highly considered purchases with much digital activity.
It’s true, of course, that marketing and math haven’t always meshed. Marketing was considered a creative endeavor somehow divorced from the rigor and transparency of science-based business process. The very definition of analytics – “the scientific process of transforming data into insight for making better decisions” – caused tension in marketing-dom. For many, “marketing science” was an oxymoron.
But no more. Better analytic methods, cloud-based technology, new C-suite thinking and – yes – big data have changed all that. In our new data-driven marketing world, art and science not only coexist in marketing, they actually complement one another. [This Art vs. Science in Marketing Analytics video, with former CMOs of P&G and Sony and a leading marketing scientist, makes the point.]
Join the Analytics Section of INFORMS
For more information, visit:
http://www.informs.org/Community/Analytics/Membership
GOOD MATH IS THE ANALYTICS
CORNERSTONE
The cornerstone of today’s success-
ful marketing analytics technology – or
what I call “Analytics 2.0” – is the math.
If you don’t have the math right, by definition your attribution will be
wrong, and by extension your allocations and attempts to optimize your
investments will also be wrong. It’s critical to understand that
effective marketing resource allocation depends on accurately
attributing revenue to different marketing investments online as well as
offline and at point of purchase.
The 1.0 version of marketing ana-
lytics includes traditional forms of mea-
surement that we’ve had for decades,
such as media mix models, agent-based
models, digital attribution and simple cor-
relations using Excel spreadsheets. Ana-
lytics 2.0 taps today’s perfect storm of big
data, technology, predictive analytics and
other marketing science to help compa-
nies reallocate billions of advertising dol-
lars while realizing double-digit sales lifts
with zero additional spend.
Advanced analytics in marketing can
home in on hundreds of a given company’s business drivers, from
pricing, distribution and online reviews, to social media
chatter, advertising and hard sales data
to uncover critical insights about what’s
really driving results, and what to do next
in the real world. The allocation step is
where you put what you’ve learned from
attribution and testing into play. Then you
can quickly measure outcomes, validate
models by running real-time tests, and
make course corrections to optimize allocations and results. (See my
March 2013 Harvard Business Review cover story, “Advertising
Analytics 2.0.”)
SPOTTING ‘QUANT QUACKERY’
Surprisingly, many big brands are
still using largely discredited simple mar-
keting mix econometric methods, or for
online marketing “last click attribution.”
Such models aren’t looking at the total
ecosystem, nor are they measuring the
precise impact of, say, TV on search, or
search’s impact on retail sales. Simply
plugging offline spend into digital marketing
analytics models doesn’t achieve
“cross-channel” analytics.
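To make the complaint about "last click attribution" concrete, here is a minimal Python sketch. It is not the author's method or any vendor's model: the journeys and revenue figures are hypothetical, and the equal-credit rule is only a deliberately simple stand-in for a multi-touch approach. It shows how last-click credit hands all revenue to the final touchpoint, while even a naive multi-touch split surfaces channels such as TV that appear earlier in the journey.

```python
from collections import defaultdict


def last_click(paths):
    """Give 100% of each conversion's revenue to the final touchpoint."""
    credit = defaultdict(float)
    for touches, revenue in paths:
        credit[touches[-1]] += revenue
    return dict(credit)


def equal_credit(paths):
    """Split each conversion's revenue evenly across every touchpoint."""
    credit = defaultdict(float)
    for touches, revenue in paths:
        share = revenue / len(touches)
        for channel in touches:
            credit[channel] += share
    return dict(credit)


# Hypothetical customer journeys: (ordered touchpoints, revenue).
PATHS = [
    (["tv", "search"], 100.0),
    (["tv", "social", "search"], 90.0),
    (["search"], 50.0),
]

if __name__ == "__main__":
    print("last click:  ", last_click(PATHS))
    print("equal credit:", equal_credit(PATHS))
```

In this toy data, last-click credits search with every dollar, while the equal split assigns part of the same revenue to TV and social. Neither rule measures the cross-channel effects the article calls for; the sketch only shows why the choice of attribution rule changes the answer.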
Using flawed models is like crediting
a single movie theater for an Academy
Award-winning performance, or trying to
win a football game with just eight players
on the field. Unfortunately, while the
buzzword “attribution” is everywhere
these days, many of the solutions trying
to solve for this challenge are sub-par.
Yet models that play without all the piec-
es are little more than what we might call
quantitative quackery.
For example, to get a true, holistic
view of what’s going on and thus make
better business decisions, you need all
forms of digital (search, social and mo-
bile) in the analytics. Without it you’re
missing a rich vein of information about
consumer behavior. And remember,
spending little or even no money on
digital doesn’t mean that online
behavior – the consumer’s “digital
life” – isn’t influencing the
decision-making process.
But that’s just one quant quackery
component. Relying heavily on old-school
data “samples” is another. Samples may
still have a place, but part of big data’s
beauty is the ability to use all the data from
online and offline marketing and sales
channels, plus external factors (such as
the weather or unemployment), not just
samples that are far more error-prone.
Still more quant quackery occurs
when marketing analytics focus on attri-
bution for only a small piece of the overall
enterprise. The “better decision-making”
that analytics promises must factor in
enterprise-wide relationships. It’s an ad-
vanced form of an old connect-the-dots
exercise, only in this version you have to
include dots for activities and outcomes
you might not be able to actually see, but
which exist nonetheless in the data.
Other potential pitfalls lurk in new or
unproven measurement approaches
such as certain efforts to measure social
stream ROI, agent-based models, ma-
chine learning and others. These methods
are suspect since you can often make the
models say whatever you want them to.
QUANTIFYING MARKETING’S
BUSINESS IMPACT
Big Data without the right math-based
analytics is a Big Problem. The “right” ana-
lytics are essential to bringing big data to
life for marketing organizations, thus allow-
ing for faster insights and better decision-
making. This includes such things as:
• quantifying the long-term impact of
brand advertising (brand equity);
• a holistic approach that includes
all online and offline methods and
channels;
• deploying the latest technology, not
simple regression models; and
• transformational thinking that takes
marketing analytics beyond simple
“research project” status toward
enterprise-wide adoption.
Consider one of our auto sector clients,
a global manufacturer that’s a superstar
in the world of marketing analytics. Their
cross-functional analytics team has the
daunting task of making sure the compa-
ny spends its $1 billion marketing budget
as effectively as possible while contribut-
ing to business goals, achieving the best
ROI and increasing shareholder value.
They do it with advanced analytics
that allow the company to run continu-
ous marketing strategy simulations un-
der a wide range of complex variations.
These simulations employ cross-media
attribution insights that help the company
predict with greater accuracy than ever
how changing the amount spent in one
marketing area will likely impact the per-
formance of advertising elsewhere, and
what this all does for the bottom line.
Using advanced analytics, this For-
tune 20 marketer has also been able to
coordinate local and national marketing
and dealer incentive budgets, and sim-
ply by shifting allocations generate tens
of millions of dollars in new revenue from
the same spending level.
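The idea of generating more revenue "simply by shifting allocations" can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the client's or MarketShare's approach: the channel names, the square-root response curves and the `SCALES` coefficients are all hypothetical. It hands a fixed budget out in small increments, each time to the channel with the highest marginal return, and compares the result with an even split.

```python
import math

# Hypothetical diminishing-returns curves: revenue = scale * sqrt(spend).
SCALES = {"tv": 40.0, "search": 30.0, "dealer_incentives": 20.0}


def revenue(allocation):
    """Total modeled revenue for a {channel: spend} allocation."""
    return sum(SCALES[ch] * math.sqrt(spend) for ch, spend in allocation.items())


def marginal_gain(channel, current_spend, step):
    """Extra modeled revenue from adding `step` to one channel."""
    scale = SCALES[channel]
    return scale * (math.sqrt(current_spend + step) - math.sqrt(current_spend))


def reallocate(total_budget, step=1.0):
    """Greedy allocation: each increment goes to the best marginal channel."""
    allocation = {ch: 0.0 for ch in SCALES}
    spent = 0.0
    while spent + step <= total_budget + 1e-9:
        best = max(SCALES, key=lambda ch: marginal_gain(ch, allocation[ch], step))
        allocation[best] += step
        spent += step
    return allocation


if __name__ == "__main__":
    budget = 300.0
    even = {ch: budget / len(SCALES) for ch in SCALES}
    shifted = reallocate(budget)
    print("even split revenue:   ", round(revenue(even), 1))
    print("reallocated revenue:  ", round(revenue(shifted), 1))
    print("reallocated spend:    ", shifted)
```

The greedy rule works here only because the assumed curves are concave and independent of one another. The simulations described in the article model cross-channel interactions, which this sketch leaves out.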
Almost any company can deploy Ana-
lytics 2.0, focusing on marketing analyt-
ic methods that avoid the pitfalls noted
above. But one thing is sure: Businesses
that don’t will be left behind. ❙
Wes Nichols is co-founder and CEO of
MarketShare, which provides advanced marketing
analytics technology for Global 1000 brands. For
independent analysis of competitors in this space
see the latest Forrester Wave Report, available
here. In this video, Nichols offers a quick overview
of Analytics 2.0.
ANALYZE THIS!
A tale of two books on decision-making
BY VIJAY MEHROTRA
Daniel Kahneman is a psychologist who was awarded the 2002 Nobel Prize for his influence on the burgeoning field of behavioral economics. I recently read his bestselling 2011 book “Thinking, Fast and Slow” [1]. The book begins with a set of chapters collectively entitled “Two Systems.” This is where the book’s title comes from: System 1 [the “Thinking Fast” from the book’s title] “operates automatically and quickly, with little or no effort and no sense of voluntary control,” while System 2 [“Thinking Slow”] is engaged in “the effortful mental activities that demand it, including complex computations …” [2].
Kahneman then proceeds to illustrate how these Systems interact. He presents several examples in which System 1’s assessment processes are simplistic and biased. System 2, while capable of making much better decisions, is shown to be “lazy” as a result of the volume and variety of demands that leave it in a busy and depleted state. The tendency toward lazy System 2 processes, it turns out, is also why so many people prove to be quite unskilled at probabilistic reasoning and associated decision-making; it is simply much, much easier for System 1’s automatic (and often incorrect) heuristics to be deployed than for System 2 to break away from its many other demands.
My System 2 was exhausted by the time I finished “Thinking,” so I simply started reading the next book
that was sitting on my nightstand, which
was Eric Siegel’s “Predictive Analytics:
The Power to Predict Who Will Click, Buy,
Lie, or Die” [3]. Siegel is a former com-
puter science professor, an experienced
analyst and more recently the founder
of the Predictive Analytics World confer-
ence series. As its title suggests, he has
written a book that focuses on data-driv-
en predictions, which he collectively la-
bels as “predictive analytics” (PA).
The centerpiece, or rather centerfold,
of the book is a list of more than 100 suc-
cess stories that involve PA, grouped into
categories ranging from “Financial Risk
and Insurance” to “Family and Personal
Life.” In turn, each chapter tells its own
tale through these PA success stories.
For example, in the chapter that explores
the ethical and privacy implications of
using data for prediction (“With Power
Comes Responsibility”), Siegel illus-
trates the key ideas through the story of
HP’s model for predicting the likelihood
of employees leaving the company and
Target’s algorithm and processes for pre-
dicting which customers were likely to be
pregnant, while in the last chapter (“Per-
suasion by the Numbers”) he shines a
bright light on U.S. Bank, Telenor (a Nor-
wegian telecommunications company)
and the Obama 2012 campaign.
When I started reading this book, I
really did not know much about machine
learning and so it was a nice window
into its key ideas (Siegel views ma-
chine learning and predictive analytics
as almost synonymous). Though this is
a decidedly non-technical book, Siegel
provides a clear illustration of the con-
cepts behind machine learning, while
also naturally introducing some of its ter-
minology (“training sets,” “test sets,” “en-
semble methods,” etc.).
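For readers meeting "training sets" and "test sets" for the first time, here is a minimal Python sketch of the idea. It is not taken from Siegel's book: the data, the single "visits" feature and the one-threshold model are hypothetical and chosen only to keep the example small. The model is fitted on one portion of the data and then scored on rows it has not seen.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(threshold, rows):
    """Share of rows where 'feature >= threshold' matches the label."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


def fit_threshold(train):
    """Pick the cutoff that classifies the training rows best."""
    candidates = sorted({x for x, _ in train})
    return max(candidates, key=lambda t: accuracy(t, train))


# Hypothetical data: (site visits last month, made a purchase?).
DATA = [(visits, visits >= 5) for visits in range(10)] * 4

if __name__ == "__main__":
    train, test = train_test_split(DATA)
    cutoff = fit_threshold(train)
    print("learned cutoff:", cutoff)
    print("training accuracy:", accuracy(cutoff, train))
    print("test accuracy:    ", accuracy(cutoff, test))
```

The test-set score is the one that matters, because it estimates how the model behaves on new cases rather than on the cases it was tuned to fit.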
Despite Siegel’s best efforts at levity
– he routinely makes puns, includes pop
culture references and even offers up
his own tongue-in-cheek poetry to keep
things light – my head began to hurt after
reading a few chapters. At first, I could
not figure out what my problem was, but
eventually, it came to me: the juxtaposi-
tion of these two books was the source
of the pain.
In the introduction to “Thinking,” Kahn-
eman writes that “this book presents my
current understanding of judgment and
decision-making, which has been shaped
by psychological discoveries of recent
decades”; indeed, many of these discov-
eries were his own, and it appears that
the desire for continuing to improve this
understanding still drives him to this day.
Siegel, on the other hand, describes the
world of predictive analytics as ruthlessly
pragmatic: “We usually don’t know about
causation, and we often don’t necessar-
ily care…prediction trumps explanation.”
Kahneman clearly relished the regu-
lar and frequent informal conversations
through which he and the late Amos Tver-
sky (with whom he published many seminal
papers prior to Tversky’s death in 1995 and
to whom “Thinking” is dedicated) evolved
their thinking. For his part, Siegel writes with
excitement of “one of the coolest things in
science, the most audacious of human am-
bitions: the automation of learning.”
The theories and discoveries that Kahn-
eman describes nearly always involved
intricate and careful data collection to test
specific hypotheses that were framed within
a large body of previous research, while
Siegel asserts that, “PA’s aim isn’t only to
assess human hunches…but also to explore
a boundless playing field of possible
truths beyond the realms of intuition.”
When I finally finished Siegel’s book,
there was no doubt in my mind that there
is a bright and growing future for predic-
tive analytics, and that Siegel is a ca-
pable and passionate spokesman with a
compelling vision for this future. Indeed,
his book illustrates that one virtue of PA
in the business world is that it provides
objective insights based on data, rather
than simply leaving managers and ex-
ecutives free to make complex decisions
by intuition alone. As it happens, several
Kahneman and Tversky papers [4] have
had a huge impact on exposing the very
real limitations in the quality of intuitive
decision-making, which has helped legiti-
mize the practice of predictive analytics.
Yet if the future belongs to analytics
professionals, as Siegel and many oth-
ers are fond of suggesting these days, I
do hope that they understand not only the
power of the computer to make discover-
ies by relentlessly sifting through increas-
ingly large piles of data but also the value
of developing deep domain knowledge,
participating in inspired discussions, and
being both persistently curious and curi-
ously persistent. These lessons, in addition
to their many insights into the psychology
of decision-making, are also a huge part of
Kahneman’s (and Tversky’s) legacy. ❙
Vijay Mehrotra ([email protected]) is
an associate professor in the Department of
Analytics and Technology at the University of San
Francisco’s School of Management. He is also an
experienced analytics consultant and entrepreneur,
an angel investor in several successful analytics
companies and a longtime member of INFORMS.
REFERENCES & NOTES
1. http://www.amazon.com/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374533555/ref=sr_1_1?ie=UTF8&qid=1377194045&sr=8-1&keywords=thinking+fast+and+slow
2. Kahneman credits Keith Stanovich and Richard West with developing the
“System 1-System 2” terminology.
3. http://www.amazon.com/Predictive-Analytics-Power-Predict-Click/dp/1118356853
4. Most notable among these papers: Tversky, A. and Kahneman, D., 1974,
“Judgment under Uncertainty: Heuristics and Biases,” Science,
Vol. 185, No. 4157, pp. 1124-1131. This article is reprinted in full at the
end of “Thinking, Fast and Slow.”
INFORMS INITIATIVE
INFORMS launches continuing education program
First course offerings include “Essential Skills for Analytics Professionals” and “Data Exploration & Visualization.”
BY BARRY LIST AND THEDRA WHITE
Know anyone in the high-end analytics profession who says they learned everything they needed to know in kindergarten? Not likely. In fact, with technology changing, few analytics professionals can even say they learned everything there was to learn in grad school.
With the need for analytics professionals to continually upgrade their skills, INFORMS recently introduced the first two in what will be a series of continuing education courses to let operations researchers and analytics professionals update their skills and learn new concepts.
The initial courses are: “Essential Skills for Analytics Professionals” and “Data Exploration and Visualization.” More courses will be available in 2014.
To ensure that the courses are valuable and that they help professionals do their jobs better and qualify for career advancement, INFORMS identified courses that provide vital job skills. The courses will be two days in length, offered in a classroom setting and held in conjunction with INFORMS
meetings, as well as offered in major
cities at specialized training facilities,
including those at company and gov-
ernment locations. The courses will be
offered initially in the United States, but
INFORMS has plans to make the cours-
es available internationally in numerous
countries.
Taking note of the increasing number
of courses that are now offered online,
INFORMS is exploring the development
of online courses, with some given in real
time and others available on demand.
This online component is slated for a
2014 rollout.
Students who complete the courses
and pass the exams will receive profes-
sional development units (PDUs).
Conscious of the need for improve-
ment, INFORMS will monitor reactions
to the courses and use the feedback to
make modifications so that students get
the most possible from their instruction.
A closer look at the two initial courses
that have been scheduled:
COURSE: ESSENTIAL SKILLS FOR
ANALYTICS PROFESSIONALS
Participants will learn practical tools
for integrating their analytical skills into
real-world problem solving for business-
es and other organizations. The course
provides approaches that can be applied
immediately to a wide variety of settings,
whether within a participant’s own orga-
nization or for an external client.
By the end of the course, participants
will:
• learn to link their subject-matter
expertise to the challenges of
messy, unstructured problems,
organizational noise and non-
technical decision makers; and
• understand best-practice
techniques, including: problem
statement summaries, issue trees,
interview guides, work plans,
sensitivity analysis, stress-testing
recommendations, the “Pyramid
Principle” of story logic, story-
boarding, slide-craft, delivering
presentations and fielding questions
and answers.
Faculty: Patrick S. Noonan
Patrick S. Noonan is professor of
Practice of Decision & Information Analy-
sis and associate dean for Management
Practice Initiatives at the Goizueta Business
School of Emory University. From
1996 to 2000 he also served as assistant
dean and director of MBA programs. He
has been a visiting professor at Duke
University’s Fuqua School of Business,
and he has taught short courses at Aalto
University (Helsinki), ESAN (Lima) and
Universidad ORT (Montevideo).
Noonan began his academic career
on the faculty of the Harvard Business
School, where he received his Ph.D. in
decision sciences.
Noonan’s traditional coursework at
Emory – which includes decision mod-
eling, game theory and data analysis
– has earned him the Distinguished Educator
Award 12 times and the “Last Lecture”
speaker role six times.
Course dates and locations: Sept. 26-
27, 9 a.m.-4:30 p.m., Seaport Confer-
ence Center, 459 Seaport Ct., Redwood
City, CA 94063; and Nov. 7-8, 9 a.m.-
4:30 p.m., Gestalt Partners, 1325 G St.,
NW, 10th Floor, Suite 1020, Washington,
DC 20005
Editor’s note: For more on Noonan and
the “essential skills” course, see “Inside
Story” in this issue of Analytics magazine.
COURSE: DATA EXPLORATION &
VISUALIZATION
Participants in this course can ex-
pect to be re-introduced to approach-
ing data in a powerful yet playful
manner. They will see and experience
how exploration and visualization can
corroborate existing hunches and
questions, reveal unexpected patterns,
as well as stimulate new perspectives
and insights.
Various commercial software packages
will be used during the course. The focus
is on understanding the underlying
methodology: how data should be
approached, handled, explored and
incorporated back into the domain of
interest.
At the end of the course, participants
will:
• have confidence to explore new data
using the exploration/visualization
approach;
• be able to approach and deploy
interactive visualization;
• understand how to identify practically
meaningful discoveries;
• experience using state-of-the-art
visualization software; and
• think more creatively about data and
insights.
Faculty: Galit Shmueli and David R.
Hardoon
Galit Shmueli is SRITNE Chaired
Professor of Data Analytics and associ-
ate professor of Statistics & Information
Systems at the Indian School of Busi-
ness. She is best known for her research
and teaching in business analytics, with
a focus on statistical and data mining
methods for contemporary data and ap-
plications in information systems and
healthcare.
Shmueli’s research has been pub-
lished in the statistics, management,
information systems, and marketing literature.
She has authored more than 70 journal
articles, books, textbooks and book
chapters, including the popular textbooks
“Data Mining for Business Intelligence”
and “Practical Time Series Forecasting.”
Shmueli is an award-winning teacher
and speaker on data analytics.
She has taught at Carnegie Mellon
University, University of Maryland, the
Israel Institute of Technology, Statistics.com
and the Indian School of Business.
David R. Hardoon is associate di-
rector of Business Analytics at Ernst &
Young Singapore, Advisory Services.
He is leading the analytics practice and
is responsible for the positioning of busi-
ness analytics advisory and services to
clients across different business sectors.
He is also an adjunct faculty member of
the School of Information Systems at
Singapore Management University and an
honorary senior research associate at the
Centre for Computational Statistics &
Machine Learning,
University College London in the United
Kingdom.
Hardoon has been engaged at various
conferences and workshops to speak on
research and business related topics in
machine learning, data mining and busi-
ness analytics. He also regularly tutors,
advises and provides consulting support
in his field of expertise – analytics and
business analytics.
For more information, visit the INFORMS
website (www.informs.org) or go directly to
https://www.informs.org/Certification-Continuing-Ed/INFORMS-Continuing-Education.
Course dates and locations: Sept.
30-Oct. 1, 9 a.m.-4:30 p.m., Seaport
Conference Center, 459 Seaport Ct.,
Redwood City, CA 94063; Oct. 3-4, 9
a.m.-4:30 p.m., Minneapolis Marriott
City Center, 30 South 7th Street, Min-
neapolis, MN 55402.
Learn more about both initial courses
and the INFORMS Continuing Education
program by visiting the INFORMS website,
www.informs.org; going directly to
https://www.informs.org/Certification-Continuing-Ed/INFORMS-Continuing-Education;
or by contacting Continuing Education
Program Manager Thedra White: e-mail:
[email protected]; phone: 1-800-
446-3676 or 443-757-3570. ❙
Barry List ([email protected]) is director of
Communications for INFORMS. Thedra White
([email protected]) is manager of the
INFORMS Continuing Education Program.
WWW. I NF OR MS . OR G 22 | A NA LY T I CS - MAGA Z I NE . OR G
Big Data and how its use is reshaping everything
from marketing, customer service, sales and even na-
tional security, flls today’s headlines. According to a
McKinsey Global Institute study (2011) the explosion
of analytical work is creating a shortage of available
workers in the feld, and the search is on for the next
generation of top analytic talent. McKinsey states that
by 2018, the U.S. alone could face a shortage of up
to 190,000 workers with deep analytical skills, as well
as 1.5 million managers and analysts with the know-
how to use the analysis of big data to make effective
decisions.
So with analytics exploding how do you jump to
the front of the line and land that top analytics job?
We’ve determined three main qualities make up
the perfect analytics candidate. First, they need the
obvious strong quantitative background in math, sta-
tistics, physics, industrial engineering or computer
science. But it’s also important to have both business
insights and strong communications skills.
I work for Enova, an online lender that uses the
power of technology and analytics to offer credit al-
ternatives to more than two million individuals world-
wide. At our company, analytics teams are embedded
How to land that top
analytics job
BY ADAM McELHINNEY
Complement technical know-how with business insight and communica-
tions skills.
FORUM
directly into the business, frequently in-
teracting with marketing teams, strategy
teams and executives. They can do the
analytics and then turn that model into
an actionable business solution that they
can communicate to stakeholders. Good
people skills and the ability to collaborate
in a business environment make a candi-
date stand out.
With that as a foundation, here’s
some advice on how to make a great
impression in the interview and land that
top analytics job you want:
1. Reach for the future. It’s important to
find a company with strong prospects
for learning and growth. I would take a
lower-paying job with a fast-growing com-
pany over a higher-paying job with a low
or negative growth company. When considering
positions, think five to 10 years
out. It’s important to think about which job
will make you better off 10 years from now
as opposed to better off next month.
2. Plus-up your resume with certifications
and competitions. Certifications
from INFORMS, SAS or CFA, or actuarial
exams, nicely complement your educational
background. Also, Kaggle and
other statistics or data mining competi-
tions are a great way to show employers
what you can do.
3. Open-source software. For the kinds
of analytics work we do at Enova, experience
with open-source software is a great asset.
Contributing to one of the open-source
data analysis tools that are available re-
ally makes a positive impression.
4. Show work samples. Academic pa-
pers and samples of a project you worked
on are also valuable. In addition to your
resume, show off your portfolio and be
prepared with short yet exciting explana-
tions of your projects.
5. Develop a logical approach to in-
terviews. It is important to demonstrate
the ability to frame problems in a logical
fashion. In discussions of actual work
projects during the interview process,
even if you don’t have experience in a
particular field or type of project, outline
the approach that will lead to a business
solution.
6. Show your analytical mind at work.
Ask questions during interviews, mean-
ingful questions that show your inquisitive
approach. Ask about the company cul-
ture. Ask the interviewer about his or her
personal career path and how someone
advances in the company. In some com-
panies, employees cannot advance until
a certain amount of time has passed or
they reach a certain milestone. Ask about
turnover percentage. Is it a place where
people have been for 10 years or where
they stay a year or two and then leave?
It’s important to uncover this information
before you take a job somewhere. Not
asking questions is actually a red flag for
someone interviewing for an analytics job.
One piece of caution: Don’t ask
overly technical questions, like whether
an interviewer prefers one really specific
modeling technique over another.
This usually doesn’t impress them and is
more a missed opportunity to ask a more
meaningful question. ❙
Adam McElhinney is the head of business
analytics at Chicago-based online lender Enova
International. His 25-person analytics team is
composed of Ph.D.s in industrial engineering,
astrophysics and physics, as well as individuals
with advanced degrees in statistics, mathematics
and computer science. You can reach McElhinney
via LinkedIn. He is a member of INFORMS.
Help Promote Analytics Magazine
It’s fast and it’s easy! Visit:
http://analytics.informs.org/button.html
The value of analytics in healthcare has never
been questioned. But pundits agree that healthcare is
behind the curve when it comes to using analytics for
unleashing powerful insights that can improve quality
of care, lower cost and engage patients. They also
agree that analytics can unlock tremendous value for
all stakeholders in the healthcare value chain. But the
key question remains: Who will champion that in the
healthcare industry? Who has the best motivation?
But before I try to get to that question, let’s consider a
few possible use cases in healthcare that can benefit
from the implementation of analytics. This is not an
exhaustive list.
Population health management with preemptive
clinical intervention. Population health management
is a well-known term among healthcare stakeholders,
but barring a few localized successes, implementa-
tion of population health management principles at
scale remains elusive. Availability of adequate data
from a bigger population is still a challenge. Interop-
erability of IT systems and data interchange is a
monumental task. But if barriers are removed and
data liquidity is enabled, analytics-driven population
Opportunities, barriers &
champions
BY RAJIB GHOSH
Despite its acknowledged value, healthcare’s use of analytics lags
behind other industries.
ANALYTICS & HEALTHCARE
health management can potentially predict
the exacerbation of patients’ conditions in
advance, triggering preemptive clinical
intervention.
Prevention of hospital readmission.
Until last year the national average
hospital readmission rate for the Medicare
population held steady at slightly above
19 percent, thereby increasing hospital
costs. In 2010 the cost of Medicare re-
admissions reached $17.5 billion. Inter-
estingly, while some readmissions are
inevitable, a 2007 analysis by the Medicare
Payment Advisory Commission (MedPAC)
suggests that 76 percent of 30-day
readmissions in the Medicare population
are preventable when appropriate inter-
ventions are applied ahead of time. Al-
though a recent study argues that this
number may not be accurate, there is a
consensus that some readmissions can
be prevented with appropriate quality of
care measures. Analytics can determine
patient cohorts that are vulnerable to re-
admission and, based on past history,
recommend appropriate measures dur-
ing the discharge process.
Reporting data for better public health
reporting and research. In Stage
2 of meaningful use, the Centers for Medicare
& Medicaid Services (CMS, www.cms.gov)
requires providers to report
syndrome-based surveillance data and
immunization registries to public health
agencies. Analytics can be utilized to
determine which conditions are worth re-
porting. When such data from a large set
of providers become available then com-
prehensive research can be conducted
based on the combination of demograph-
ic and episodic data.
Personalized medicine. No two pa-
tients are the same. Standard of care for
a patient should ideally be unique and
tailored based on his or her history and
biology. That’s not how medicine is prac-
ticed today. As genetic data becomes
more freely and cheaply available, powerful
analytics platforms will be able to
generate more refined and personalized
standards of care for patients.
With that said, let’s take a quick look
at the key barriers to applying analytics
to the healthcare value chain.
Lots of data. Lots of silos. A huge vol-
ume of healthcare data is available, mak-
ing the industry an ideal vertical for the
application of analytics. But most data
sets are bound in their own silos of sys-
tems that seldom interoperate. Break-
ing down the silos is a key requirement
before analytics can produce desired
insights. With the Affordable Care
Act (ACA) and healthcare reform, more
focus has been given to the interopera-
bility of systems. Several data hubs such
as Health Information Exchanges (HIE)
were set up during the last few years with
government funds. Standards are still in
the making, but recently the electronic
medical record (EMR) industry has come
forward to create an alliance, Common-
Well Health Alliance, to push forward the
agenda of interoperability. That certainly
is a step in the right direction.
Who pays? Implementing analytics to
assist decision support in care settings
or public health is no trivial task. Not a
cheap task either! Of course, we do not
need to boil the ocean to get started. Still,
money is scarce for most stakeholders
in healthcare (see Figure 1). Pharmaceutical
companies make the most profit,
followed by medical device makers. But
do they have incentives to ignite analyt-
ics-driven healthcare delivery?
Machine algorithm vs. human brain –
which is better? Algorithmic clinical de-
cision support is viewed with skepticism
in the clinical community. The analytics
industry has not proven convincingly with
evidence that an analytics-driven medi-
cal expert system can produce better
clinical care pathways for patients with
Figure 1: Profit margins in healthcare.
Source: SEC 2010 data, AHIP Coverage
complex diseases. After all, despite the
large set of knowledge that the machine
can assimilate and retain, human clini-
cians drive the decision-making process
of the expert system, or prescriptive ana-
lytics, and they have their own individual
biases. Do we need a healthcare “Jeopardy!”
game to prove that predictive and
prescriptive analytics can work better
than humans?
Patient or profit – who is at the center?
Dr. Joe Kvedar, director of the Center for
Connected Health, raised a pertinent
question in his recent blog: Who benefits
more if consumers get a “Google Now”-
like personalized prevention service that
is just-in-time, predictive, preemptive
and personalized? Google Now analyzes
huge amounts of data from a user’s
searches, e-mails, calendars and other
Web visits to figure out what the user is
going to do next. Then it presents context-sensitive
information without the user doing
any lookup. Surely, breaking down
data silos and applying powerful analytics can
enable such a paradigm in healthcare, but
outside of the patient community, which
organizations have real incentives to do
that? A recent New York Times article
elaborated on this dichotomy – healthcare
and profits are a poor mix! After all, every
dollar taken out of the healthcare system
comes out of someone’s coffers.
Insurance payers have launched strategic
initiatives to aggregate more data
and build analytics to leverage it.
Payers such as United Healthcare, Aet-
na and WellPoint have either acquired
technology companies (e.g., Aetna
bought Medicity and United Health-
care bought Ingenix and Humedica)
or built partnerships with technology
companies (e.g., WellPoint and IBM)
to offer analytics-as-a-service to provider
organizations. It certainly helps
payers to mitigate their risk exposures,
prevent costly hospitalization episodes
and avoid unnecessary procedures, lab
tests and prescribed pharmaceuticals.
They can even improve their medical
loss ratio (MLR) by investing in technology-driven
wellness and prevention
initiatives. But it remains to be seen if
this model produces better health out-
comes for the patients. ❙
Rajib Ghosh ([email protected]) is an
independent consultant and business advisor
with 20 years of technology experience in various
industry verticals where he had senior level
management roles in software engineering,
program management, product management and
business and strategy development. Ghosh spent a
decade in the U.S. healthcare industry as part of a
global ecosystem of medical device manufacturers,
medical software companies and telehealth and
telemedicine solution providers. He’s held senior
positions at Hill-Rom, Solta Medical and Bosch
Healthcare. His recent work interest includes public
health and the field of IT-enabled sustainable
healthcare delivery in the United States as well as
emerging nations.
Turn big data into
information with
high-performance
analytics
BY PAUL KENT,
RADHIKA KULKARNI AND
UDO SGLAVO
BIG DATA
We’re in the era of big data,
but what do we mean by
that? In our view, big data
is a relative, not absolute,
term. It means that the organization’s
need to handle, store and analyze data
(its volume, variety, velocity, variabil-
ity and complexity) exceeds its current
capacity and has moved beyond the IT
comfort zone [1]. Big data is the clas-
sic dual-edged sword – both potential
asset and possible curse. Most agree
that there is significant, meaningful,
proprietary value in that data. But few
organizations relish the costs and chal-
lenges of simply collecting, storing
and transferring that massive amount
of data. And even fewer know how to
tap into that value, to turn the data into
information.
Is the enterprise IT department merely
an episode of TV’s “Hoarders” waiting to
happen – or will we actually find ways to
locate the information of strategic value
that is getting buried deeper and deeper
in our mountains of data? Quite simply:
What are we going to do with all of this
data?
At its essence, high-performance an-
alytics (HPA) offers a simple, but power-
ful, promise: Regardless of how you store
data or how much of it there is, complex
analytical procedures can still access that
data, build powerful analytical models
using that data, and provide answers
quickly and accurately by using the full
potential of the resources in your com-
puting environment.
With high-performance analytics, we
are no longer primarily concerned with
where the data resides. Today, our ability
to compute has far outstripped our ability
to move massive amounts of data from
disk to disk. Instead, we use a divide-
and-conquer approach to cleverly send
the processing out to where the data
lives.
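The divide-and-conquer idea can be illustrated with a minimal Python sketch. It is not any vendor's implementation; the `Partial` record, the function names and the sample partitions are all hypothetical. Each partition is summarized where it sits, and only the small summaries travel to a coordinator that combines them into a global mean.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partial:
    """Small summary computed where a data partition lives (hypothetical type)."""
    count: int
    total: float


def summarize(partition: List[float]) -> Partial:
    # Runs next to the data: only two numbers leave the node.
    return Partial(count=len(partition), total=sum(partition))


def combine(partials: List[Partial]) -> float:
    # Runs on the coordinator: merges the small summaries into a global mean.
    count = sum(p.count for p in partials)
    total = sum(p.total for p in partials)
    if count == 0:
        raise ValueError("no observations")
    return total / count


# Hypothetical partitions standing in for data stored on three nodes.
partitions = [[10.0, 20.0], [30.0], [40.0, 50.0, 60.0]]
global_mean = combine([summarize(p) for p in partitions])
```

The point of the design is that the amount of data moved depends on the number of partitions, not on the number of observations.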
Ultimately, HPA is about the value of
speed and its effect on business behav-
ior. If the analytic infrastructure requires
a day to deliver a single computational
result, you’re likely to simply accept the
answer it provides. But if you can use
HPA to get an answer in one minute, your
behavior changes. You ask more ques-
tions. You explore more alternatives. You
run more scenarios. And you pursue bet-
ter outcomes.
But how do we bring the power of
high-performance analytics to data vol-
umes of this scale? We believe there are
three basic pillars – three innovative
approaches – to bring HPA to big data:
• Grid computing: distribute
the workload among several
computing engines. Grid computing
enables analysts to automatically
use a centrally managed grid
infrastructure that provides workload
balancing, high availability and
parallel processing for business
analytics jobs and processes. With
grid computing, it is easier and
more cost-effective to accommodate
compute-intensive applications
and growing numbers of users
appropriately across available
hardware resources and ensure
continuous high availability for
business analytics applications.
You can create a managed, shared
environment to process large
volumes of programs in an efficient
manner.
• In-database analytics: move the
analytics process closer to the
data. With in-database processing,
analytic functions are executed
within database engines using
native database code. Traditional
programming may include copying
data to a secondary location, and
the data is processed using the
programming language outside the
database. Benefits of in-database
processing include reduced data
movement, faster run-times, and
the ability to leverage existing data
warehousing investments.
• In-memory analytics: distribute
the workload and data alongside
the database. In this approach,
big data and intricate analytical
computations are processed in-
memory and distributed across a
dedicated set of nodes to produce
highly accurate insights to solve
complex problems in near-real
time. This is about applying high-
end analytical techniques to solve
these problems within the in-
memory environment. For optimal
performance, data is pulled and
placed within the memory of a
dedicated database appliance for
analytic processing.
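As a rough illustration of the in-database approach described above, the following Python sketch uses the standard-library `sqlite3` module as a stand-in for a data warehouse. The `sales` table and its values are invented for the example. It contrasts copying every row out for client-side processing with letting the database engine compute the aggregate itself.

```python
import sqlite3

# In-memory SQLite database standing in for a data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 300.0), ("west", 50.0), ("west", 150.0)],
)

# Traditional approach: copy every row out, then aggregate in the client.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
client_side = {}
for region, amount in rows:
    total, n = client_side.get(region, (0.0, 0))
    client_side[region] = (total + amount, n + 1)
client_means = {r: t / n for r, (t, n) in client_side.items()}

# In-database approach: the engine aggregates; only one row per region moves.
in_db_means = dict(
    conn.execute("SELECT region, AVG(amount) FROM sales GROUP BY region").fetchall()
)
conn.close()
```

Both paths give the same per-region means; the difference is how many rows cross the boundary between the database and the analytic program.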
KEYS TO HPA SUCCESS
What does it take to succeed with
high-performance analytics? HPA isn’t
simply an incremental discipline. It in-
volves innovative shifts in how we ap-
proach analytic problems. We view them
differently and continue to find new ways
to solve them. It’s more than simply tak-
ing a serial algorithm and breaking it
into chunks. Success requires deeper,
broader algorithms in multiple disciplines
and the ability to rethink our business
processes.
In our experience, HPA solutions to
complex business problems require in-
novation along two different dimensions.
First, algorithms and modeling techniques
must be invented and built to exploit the
power of massively parallel computational
environments in three major areas:
• Descriptive analytics. You can
report and generate descriptive
statistics of historical performance
that help you see what has
transpired far more clearly than ever
before.
• Predictive analytics. You can use
data relationships to model, predict
and forecast business results in
impressive ways and predict future
events and outcomes.
• Prescriptive analytics. You can
identify the relationships among
variables to develop optimized
recommendations that take advantage
of your predictions and forecasts and
foresee the likely implications of each
decision option [2].
Second, HPA tools and products must
be built, incorporating these high-perfor-
mance analytics techniques, to enable
the following:
• visualization and exploration of
massive volumes of data;
• creation of analytical models that
use multi-disciplinary approaches
such as statistics, data mining,
forecasting, text analytics and
optimization; and
• application of domain-specific
solutions to complex problems
that incorporate both specific
analytical techniques and
the business processes to support
decision-making.
What makes HPA so compelling to
businesses across the spectrum – and
makes them willing to undertake this
fundamental rethinking of analytics – is
the ability to address and resolve trans-
formational business problems that have
the potential to fundamentally change
the nature of the business itself. By pro-
cessing billions of observations and thou-
sands of variables in near-real time, HPA
is unleashing power and capabilities that
are without precedent. Your business
could witness the same results, for ex-
ample, by taking the following steps:
• implementing a data mining tool that
creates predictive and descriptive
models on enormous data volumes;
• using those models to predict which
customers might abandon an online
application and offer them incentives
to continue their session; and
• comparing these incentives against
one another and the budget, in real
time, to identify the best offer for
each customer.
That’s the kind of emphatic value that
HPA can provide and why it’s continuing
to garner the attention of many enterpris-
es today.
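The last two steps in the list above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the abandonment probabilities, the offers, their costs and effects, the order value and the greedy selection rule. A production system would score customers with a fitted model and might use a proper optimizer instead of a greedy pass.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model output: predicted probability that each customer abandons.
abandon_prob: Dict[str, float] = {"c1": 0.80, "c2": 0.30, "c3": 0.60}

# Hypothetical offers: name -> (cost, fraction of abandonment risk removed).
offers: Dict[str, Tuple[float, float]] = {
    "free_shipping": (5.0, 0.25),
    "discount_10": (12.0, 0.50),
}
ORDER_VALUE = 100.0  # assumed value of a completed application


def expected_gain(p_abandon: float, cost: float, lift: float) -> float:
    # Expected revenue recovered by the offer, net of its cost.
    return p_abandon * lift * ORDER_VALUE - cost


def assign_offers(budget: float) -> Dict[str, Optional[str]]:
    # Greedy heuristic: rank every (customer, offer) pair by expected gain and
    # fund the best pairs first, one offer per customer, while budget remains.
    candidates = [
        (expected_gain(p, cost, lift), cust, name, cost)
        for cust, p in abandon_prob.items()
        for name, (cost, lift) in offers.items()
    ]
    chosen: Dict[str, Optional[str]] = {cust: None for cust in abandon_prob}
    for gain, cust, name, cost in sorted(candidates, reverse=True):
        if gain > 0 and chosen[cust] is None and cost <= budget:
            chosen[cust] = name
            budget -= cost
    return chosen


plan = assign_offers(budget=20.0)
```

With a budget of 20, the sketch funds the larger discount for the customer most likely to abandon, falls back to the cheaper offer for the next customer and leaves the lowest-risk customer without an incentive.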
CONCLUSION
Amazingly, the discipline of high-per-
formance analytics continues to move for-
ward at a rapid pace. As storage gets even
more affordable and greater amounts of
processing power become ever-cheap-
er, it’s easy for us to envision “analytical
streaming” in real time where insights are
not discrete events but are part of the min-
ute-by-minute operation of the enterprise,
woven into the fabric of every meaningful
business process. Moving further down
the cost curve will enable us to further de-
mocratize analytics and move it beyond
the specialized analyst and into the hands
of virtually every employee, increasing the
breadth and depth of the value. By push-
ing out the power of this style of HPA, we
have the opportunity to achieve exponen-
tially outsized gains driven by new levels
of rapid analysis. ❙
Paul Kent is the vice president of Big Data at SAS.
Radhika Kulkarni is vice president of Advanced
Analytics R&D at SAS and a senior member of
INFORMS. Udo Sglavo is principal analytical
consultant at SAS. This article was excerpted and
adapted from the chapter “Finding Big Value in Big
Data: Unlocking the Power of High-Performance
Analytics” by the authors in “Big Data and Business
Analytics,” edited by Jay Liebowitz, ©2013 Taylor &
Francis Group LLC. Reprinted with permission.
REFERENCES
1. For more information, visit: http://www.sas.com/big-data/index.html
2. For more information, visit: http://www.informs.org/Community/Analytics
Subscribe to Analytics
It’s fast, it’s easy and it’s FREE!
Just visit: http://analytics.informs.org/
The information asymmetry problem: How decision
science can help reduce market inefficiency.
Bridging the
information gap
BY KRISHNA RUPANAGUNTA,
AJAY PARASURAMAN AND
SOURAV BANERJEE
BEHAVIORAL ECONOMICS
That information is integral
to any economic trans-
action has always been
generally accepted; how-
ever, the degree to which information
influences the outcome of a transaction,
especially when it is between human be-
ings (as opposed to rational economic
agents), is a relatively more recent
discovery.
Classical economics is based on the
perfectly competitive general equilibri-
um model in which every party involved
in a transaction has access to exactly
the same information. This model also
assumes that any new information is
rapidly disseminated to all the parties
on the demand and supply sides, which
takes the market back to equilibrium.
This traditional model was based on
the concept of a fundamentally rational
decision-maker, an assumption that evidently does not
hold. This paradigm was challenged first
by the neo-classical economists, who
attempted to integrate psychology into
economics; however, they continued to
assume that people were focused on
maximizing utility. In the 1960s, a whole
new breed of economists challenged
this paradigm by drawing upon how peo-
ple irrationally seek satisfaction, rather
than maximizing utility. And behavioral
economics, which integrated cognitive
psychology with economics, was born.
One of the more interesting areas of
study in behavioral economics in recent
times has been how unequal access to
information influences the outcomes of
transactions.
INFORMATION ASYMMETRY:
WHAT IS IT?
One commonly accepted definition
of information asymmetry is: “A situation
in which one party in a transaction has
more or superior information compared
to another. This often happens in trans-
actions where the seller knows more
than the buyer, although the reverse can
happen as well. Potentially, this could
be a harmful situation because one par-
ty can take advantage of the other par-
ty’s lack of knowledge” [1]. Information
asymmetry can prevent consumers in a
market from making fully informed decisions,
which in turn can result in market
inefficiencies or, in the worst case, in
market failure.
A recent example of how informa-
tion asymmetry could end up creating a
Black Swan event with disastrous conse-
quences was the 2008 subprime crisis.
As banks started relaxing the credit requirements
in their quest to get a larger slice of the mortgage
market, sub-prime borrowers gamed the system by
hiding information that otherwise would have
disqualified them. This quickly turned into a negative
spiral as banks/housing finance companies tried to
out-do each other in chasing these high-risk borrow-
ers. The Adverse Selection cycle set in quickly as
riskier borrowers piled into the system. This giant
Ponzi scheme finally blew up spectacularly in 2008 –
information asymmetry was at the heart of this crisis.
This is clearly an extreme case but serves as a
powerful illustration of how important information is
to keep the wheels of modern capitalism in motion.
Government regulators as well as private organizations
recognize this and are constantly making efforts
to minimize inefficiencies caused by information
asymmetry in the marketplace. Governments require
pharmaceutical companies to publish the risks
and side effects associated with drugs. Likewise,
private companies take it upon themselves to better
inform buyers by providing as much information as
possible – e-commerce sites providing information
to buyers through product reviews and ratings are
examples.
Even in a world where technological enhance-
ments have dramatically reduced barriers to informa-
tion availability, information asymmetries continue
INFORMATION ASYMMETRY
Request a no-obligation INFORMS Member Benefits Packet
For more information, visit: http://www.informs.org/Membership
to exist in various forms in almost every
business transaction. In a broad sense,
asymmetric information is manifested in
two major categories: adverse selection
and moral hazard.
Adverse selection: Asymmetric in-
formation before a transaction, which
leads to the less informed party se-
lecting “bad” products or services. In-
surance companies are exposed to
the risk of adverse selection since it is
very difficult to assess the true risk of
every customer. This constraint often
forces them to offer products at prices
that could end up attracting high-risk
customers and driving away lower-risk
ones. Banks also face similar risks; they
may end up having a significant proportion
of high-risk customers in their loan
books.
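A small worked example may help show why a single pooled price attracts high-risk customers and drives away lower-risk ones. The Python sketch below is a toy model: the expected-loss figures, the `max_markup` tolerance and the rule that customers leave once the premium exceeds that multiple of their own expected loss are all assumptions made for illustration.

```python
from typing import List


def pooled_premium(expected_losses: List[float]) -> float:
    # With no way to tell customers apart, the insurer charges the pool average.
    return sum(expected_losses) / len(expected_losses)


def simulate(expected_losses: List[float], max_markup: float = 1.5) -> List[float]:
    """Return the premium charged in each round until the pool stops changing.

    Assumption: a customer stays only while the premium is at most
    `max_markup` times that customer's own expected loss.
    """
    pool = list(expected_losses)
    premiums = []
    while pool:
        premium = pooled_premium(pool)
        premiums.append(premium)
        stayers = [loss for loss in pool if premium <= max_markup * loss]
        if len(stayers) == len(pool):
            break
        pool = stayers
    return premiums


# Hypothetical book: expected annual loss per customer, in dollars.
premiums = simulate([100.0, 200.0, 400.0, 1000.0])
```

In this toy book the premium climbs from 425 to 700 to 1,000 as the lower-risk customers leave in successive rounds, until only the riskiest customer remains.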
Moral hazard: A situation resulting from
asymmetric information where the more
informed party misuses private informa-
tion for unethical behavior. For instance,
in e-commerce transactions, buyers and
sellers are physically removed from each
other. Dishonest sellers may not divulge
details about the product being sold, and
the buyer may be sold an inferior prod-
uct lacking in quality. On the other hand,
in online marketplaces that are buyer-biased
and where the onus of proof is
on the seller, the buyer may make false
claims of receiving a damaged item with
the intention of manipulating the seller to
lower prices.
Both these situations present a chal-
lenge to businesses, which constantly
struggle to mitigate risks and potential
losses arising from information asym-
metry. Businesses tackle these prob-
lems using a combination of heuristic
and fact-based decision-making. For ex-
ample, most insurers now have sophis-
ticated risk evaluation frameworks that
help them alleviate the risks of adverse
selection. E-commerce sites are con-
stantly innovating and upgrading their
infrastructure to provide more and more
correct information to both buyers and
sellers on the website. This is where ana-
lytics and behavioral sciences can come
together as decision science to solve this
problem.
ENTER DECISION SCIENCE
With the advent of newer technolo-
gies and the ability to store and pro-
cess vast amounts of data, decision
science is a fast emerging discipline
that offers solutions leveraging analyti-
cal techniques and combines them with
concepts from behavioral psychology.
For instance, adverse selection is a par-
ticularly thorny problem in the world of
insurance. The Progressive Insurance
Company discovered a correlation between
financial responsibility (or lack
of it) and reckless driving. This nugget
helped them better tailor their insurance
products.
The information explosion (data from
social media, clickstream and telematics
being a few examples) combined with
advances in technologies (big data, high
performance computing) offers compa-
nies the ability to solve some of these
problems.
In most situations in the insurance
industry, information asymmetry exists
in the form of adverse selection. Though
most analysts faithfully adopt the use of
credit scores to assess the risk of poten-
tial customers, a certain limit on accura-
cy still persists due to information gaps
such as a lack of complete knowledge
surrounding the characteristic behav-
ior of an insurance buyer. Some of the
more savvy insurance firms are trying to
bridge this information gap by triangu-
lating with other data sources to extract
better behavioral signals about their cus-
tomers. For instance, auto insurers are
collaborating with auto manufacturers
to capture in-car telemetry data that reflects
driving behavior and to use it to refine risk
profiles.
Data captured from a user’s online activities has
become a goldmine of information for understanding
customer behavior that cannot normally be
figured out from transactional metrics and
demographics alone. As President Obama’s election
campaign team demonstrated in the 2012 elections,
micro-targeting and real-time monitoring help
keep track of behavioral changes in voters.
Insurance companies can apply similar techniques
to fine-tune the risk profile of every customer, tailor
a solution that best fits the customer and
continuously update the risk profile based on the
customer’s actions. In the process, they reduce the
information gap that currently results in higher
risk exposure.
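As a rough illustration of what continuously updating a risk profile could look like, the sketch below blends each new trip's telematics into a running score. The function name, the "harsh events per 100 miles" measure, the weight and the cap are all hypothetical choices made for this example; they are not drawn from any insurer's actual method.

```python
def update_risk_score(current_score, harsh_events, miles_driven,
                      weight=0.2, cap_per_100_miles=5.0):
    """Blend one trip's telematics into a running risk score between 0 and 1."""
    if miles_driven <= 0:
        return current_score  # no driving observed, so leave the profile unchanged
    # Harsh events (e.g., hard braking) per 100 miles, capped to limit outliers.
    rate = min(harsh_events / miles_driven * 100.0, cap_per_100_miles)
    trip_score = rate / cap_per_100_miles
    # Exponentially weighted update: recent trips count more than old ones.
    return (1.0 - weight) * current_score + weight * trip_score


score = 0.5  # hypothetical starting risk for a new customer
for events, miles in [(0, 20.0), (0, 35.0), (3, 10.0)]:
    score = update_risk_score(score, events, miles)
```

A weighted update of this kind lets careful trips gradually lower the score and a reckless trip raise it, without any single trip dominating the profile.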
The moral hazard problem has probably been
around for as long as the market economy has ex-
isted. The online marketplace is turning out to be a
test-bed where the moral hazard problem can be min-
imized by the use of decision science.
In normal buyer–seller transactions, it is extremely
difficult to identify manipulative buyers who
make false claims of delayed shipping or broken
items and cause dissatisfaction to honest sellers.
This moral hazard problem can end up
driving honest sellers from the market-
place, clearly not the best solution.
The online world offers a chance
to eliminate this inefficiency simply by
storing historical transactions and le-
veraging them to draw patterns. For in-
stance, looking at historical transactions
and triangulating this with other sourc-
es (e.g., customer service transcripts),
it is possible to predict the likelihood of
fraudulent buyer behavior and proactively
alert the seller. Moreover, a buyer-level
rating can be devised and used to
better inform the sellers. One leading
technology platform has implemented a
text mining solution to sift through chat
transcripts for early-warning signals of
bad buyer behavior. This information is
then used to proactively warn the sell-
ers, alerting them to the potential for
morally hazardous behavior from the
buyers.
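A buyer-level rating of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the warning phrases, the 70/30 blend, the smoothing priors and the alert threshold are all hypothetical, and simple phrase matching stands in for the text mining solution the platform actually uses, which is not described here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical warning phrases; a real system would learn signals from labeled data.
WARNING_PHRASES = ("never arrived", "item broken", "refund now", "chargeback")


@dataclass
class BuyerHistory:
    buyer_id: str
    orders: int             # completed orders
    claims: int             # claims of delayed shipping or broken items
    transcripts: List[str]  # customer service chat transcripts


def flagged_share(transcripts):
    """Share of transcripts containing at least one warning phrase."""
    if not transcripts:
        return 0.0
    hits = sum(
        any(phrase in text.lower() for phrase in WARNING_PHRASES)
        for text in transcripts
    )
    return hits / len(transcripts)


def buyer_risk_score(history, prior_claims=1.0, prior_orders=20.0):
    """Blend a smoothed historical claim rate with transcript signals (0 to 1)."""
    # Smoothing keeps a buyer with very few orders from scoring as extreme.
    claim_rate = (history.claims + prior_claims) / (history.orders + prior_orders)
    return min(1.0, 0.7 * claim_rate + 0.3 * flagged_share(history.transcripts))


def should_alert_seller(history, threshold=0.25):
    """Decide whether to warn the seller before the transaction proceeds."""
    return buyer_risk_score(history) >= threshold


honest = BuyerHistory("b-001", orders=50, claims=0,
                      transcripts=["Thanks, the package came early!"])
risky = BuyerHistory("b-002", orders=10, claims=6,
                     transcripts=["The item never arrived, refund now",
                                  "Item broken again"])
```

Here `should_alert_seller(risky)` flags the second buyer while the first passes quietly, which is the proactive warning described above in miniature.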
SEPARATING OUT THE SIGNAL
FROM THE NOISE
Information asymmetry has contributed
to creating market inefficiencies in
a wide variety of industries, and companies
have been trying to find ways to
bridge this gap for a long time. The re-
cent advances in decision science have
created opportunities for companies
to narrow this gap and inch toward a more
perfect marketplace that minimizes the
potential to create rent-seeking behavior
in economic transactions. The availability
of data is now making it possible to solve
some of the adverse selection and moral
hazard problems.
This comes with a caveat: where
there is data, there is also too much of
it. And sifting the signal from the noise
is another challenge altogether. Winning
companies will be the ones that manage
to extract the signals and use them to
remove inefficiencies caused by information
asymmetry. ❙
Krishna Rupanagunta is responsible for client
delivery, people development and providing thought
leadership across projects at Mu Sigma. With
more than 14 years of experience, Rupanagunta
has a strong background in business consulting,
servicing Fortune 500 clients across multiple
industries with a specific focus on supply chain
optimization. Prior to joining Mu Sigma, he was part
of a nonprofit that helped the Indian government
in the conceptualization and design of the largest
citizen identity project ever attempted in the world.
He has a master’s degree from IIM - Calcutta.
Ajay Parasuraman is a senior business analyst
at Mu Sigma and is based in Bangalore, India. He
has several years of experience in data analysis
across industry verticals.
Sourav Banerjee is a senior manager at Mu
Sigma with vast experience in analytics consulting
with multiple Fortune 500 clients. His experience
spans across multiple industries – insurance,
technology, telecom, retail and healthcare across
multiple geographies.
REFERENCES
1. http://www.investopedia.com/terms/a/asymmetricinformation.asp
PAW founder Eric Siegel discusses the power of
predictive analytics, privacy issues, his new book and
what the future may hold for analytics professionals
and consumers.
Predictive analytics
power player
BY PETER HORNER
Q&A
Eric Siegel, founder of
Predictive Analytics World
(PAW, a series of confer-
ences held throughout the
year in major U.S. and European cities)
and the author of the new book, “Predictive
Analytics: The Power to Predict Who
Will Click, Buy, Lie or Die,” is without question
a key player in the ongoing global
analytics movement that’s transforming
the way organizations conduct business.
The college-professor-turned-entrepre-
neur recently shared his perspective on
the dynamic analytics world with Analyt-
ics magazine. During a 30-minute chat,
we talked about several issues raised in
the book including privacy concerns in
the big data era, PAW and his prediction
for the future of predictive analytics. Fol-
lowing are excerpts from the interview.
Tom Davenport and Jeanne Harris’
2007 book, “Competing on Analytics,”
brought analytics to the attention of
the business world on a mass scale
and launched a wave of analytics-ori-
ented books. What motivated “Predic-
tive Analytics”?
I wanted to bring the concepts of
predictive analytics to anybody and ev-
erybody who might be interested in the
remotest sense of the word, including
people who aren’t even business con-
sumers let alone technical practitioners.
I strove to write a book that, although
informative and conceptually coherent
and comprehensive, is accessible. It’s an
intro textbook disguised as a fun, enter-
taining, pop science book. I don’t mean
in any way to diminish the book’s value to
people who are new to the field and are
interested in making use of this technol-
ogy – business users, business readers
and prospective technical practitioners
– because it very much covers the main
concept behind predictive analytics, how
it’s used and how it works.
You don’t often see the words “analyt-
ics” and “fun” mentioned in the same
sentence, but you see it in your book.
What is fun about analytics?
Predictive analytics is an incarnation
of machine learning where the computer
learns from experience, from data. It is
learning how to predict, and the science
involved is gee-whiz cool.
It’s compelling from a philosophical
standpoint in terms of what the challenge is
and what it means to actually discover new
knowledge from historical data that will ap-
ply in the future under new circumstances.
That is really interesting, fun stuff. There’s
no reason that shouldn’t be explained in a
way that anyone could understand exactly
what’s going on and why it’s so cool.
On the business side, it’s exciting be-
cause it’s so valuable. It is, in a sense, the
holy grail for all sorts of applications in
marketing, fraud detection, credit scoring
and outside of business in government,
law enforcement, education, nonprofits,
even presidential campaigns.
The value is so strong because the
ability to make predictions for each indi-
vidual directly informs operational actions
and decisions, so this is the ultimate of
data-driven decision-making per individu-
al. The prediction for the individual directly
informs how to treat or contact or whatev-
er action to take with that individual. That’s
extremely valuable. It’s fun because the
science is really cool and it’s amazing to
explore, and because the resulting value
is changing the world and irrefutable.
The subtitle of your book, “The power
to predict who will click, buy, lie or
die,” is provocative and, in a way, trou-
bling since it conjures up visions of
Big Brother and privacy concerns as-
sociated with big data and analytics.
That is a semi-humorous subtitle to
signal the reader that, hey, this is not
your traditional business book … but
certainly predicting when you will die is
one of the places where ethics come
up in civil liberty issues such as privacy.
In general, the biggest negative contri-
bution that predictive models have for
privacy is that in addition to what data
should be shared to what degree and
what data should be considered sensi-
tive, we are now introducing the power
to infer new data that can be even more
sensitive such as: When are you going to
die? Are you likely to get pregnant? Are
you going to quit your job? Are you going
to commit a crime again if we release you
from prison?
These things are extremely sensitive,
and the way they are acted upon needs
to be carefully monitored. In general, it’s
very hard to delineate hard and fast lines
of where things start to become problematic
and where they become ethically
questionable. However, the first and
most important thing we can do is spread
the word. The world at large needs to see
exactly what’s going on. This new power
is very valuable, but it also has this risk
associated with it.
Given all of the competitive advantag-
es analytics offers, why are so many
organizations still reluctant to jump
on the analytics bandwagon?
Predictive analytics is a relatively
complex initiative, particularly if an or-
ganization has never used it before. My
book is trying to help change the percep-
tion that it is overly complex, but there
are challenges. It’s a new concept, it’s a
new way of doing things, there’s a cer-
tain amount of inertia to overcome and
resources need to be carved out in order
to take that first step. It doesn’t happen
overnight.
It’s been very exciting over the last
several years to watch analytics explode,
but it’s still just the tip of the iceberg.
There’s still so much untapped potential.
Everyone’s excited about big data, and
there’s no question that data is exploding
like mad, but that’s sort of easy. All you
have to do to explode your data is to not
delete it, which is sort of a no-brainer be-
cause the data is so cheap to store.
The hype around big data, however,
doesn’t directly address what the actual
value is and what the purpose of the data
is. One of the most actionable things you
can get from data is to learn from it to
predict behavior at the individual level.
That’s predictive analytics. So given all
that hype over big data and the incentive
to make use of it and leverage the data,
predictive analytics is a key way to do so,
and it is really taking hold in many sec-
tors at a breakneck pace.
Over the last 15 or 20 years, the corpo-
rate world has tried to catch and ride oth-
er big, so-called game-changing waves
such as “re-engineering” and “enter-
prise resource planning” with mixed re-
sults. What makes analytics different?
That’s a good question. The examples
you mentioned had to do with infrastructure
– having a larger disk drive and
how corporations remember things.
Analytics isn’t an engineering thing; it’s
science in the sense that it’s about content;
it’s about substance; it’s about applying
certain kinds of math; it’s about finding
meaning in the data. The data is not just about
a bunch of boring ones and zeros; it’s a
recording of business history.
Of course, corporations need engi-
neering and infrastructure to keep and
store the data and to be able to access it.
With analytics, though, we find out what
the data is actually telling us and what we
can learn from it, and that speaks directly
to the heart of improving the mass-scale
operations that organizations conduct.
The way – the ultimate way – to improve
those operations is by guiding them with
predictions on a per-individual basis.
[Photo: Eric Siegel, college professor-turned-entrepreneur, is the
author of the new book, “Predictive Analytics.”]
Analytics-oriented conferences are popping up almost
as fast as analytics-oriented books. How do you keep
PAW conferences fresh and relevant
in such a competitive, fast-changing
environment?
It’s been astonishing just how many
big data and other kinds of analytics con-
ferences have popped up in the last two
or three years. We launched Predictive
Analytics World in February 2009, and
the conference now takes place seven
times a year and that number will prob-
ably increase in 2014, including confer-
ences in Canada and Europe.
We have a small number of repeat
speakers who are just amazing – you
might call them rock star consultants –
and we attract many interested brand-
name practitioners who are at the top of
their craft and have great stories to tell
from their organizations. The program is
extremely rich, but I think the main dif-
ferentiator from the other events is that
we’re focused very specifically on predictive
analytics. Very few other events
have attempted to compete directly with
us in that way. I think there are a lot of
people out there who are interested
in that sort of clear focus on predictive
analytics rather than the broader realm of
analytical methodologies.
As the head of PAW, you’ve no doubt
observed dozens if not hundreds of
organizations and their respective
analytical expertise. If you had to pick
just one organization that “really gets
it,” an organization that best uses the
power of analytics for competitive ad-
vantage, who would it be?
I’ve seen many, many such examples.
In marketing there’s everyone from Tar-
get on down to small organizations like
Vermont Country Store. There’s Harrah’s
Las Vegas, which Tom Davenport made
famous for their analytical methods.
There’s FedEx and all the large cell
phone companies, and that’s just within
marketing applications. An insert in my
book provides 147 examples across nine
sectors, including financial risk, healthcare,
crime fighting, government, education,
etc.
As to which organization is doing the
best, frankly my focus has been on finding
juicy individual case studies rather
than evaluating an organization’s overall
analytical performance. I think there’s a
need for that type of thing, perhaps some
kind of award that’s given to a company
for general analytical success in a particu-
lar sector. I see that as a worthy exercise,
but it’s not something I have studied in a
coherent way. I can say there are a lot of
very talented analytical professionals do-
ing a lot of great work in many, many orga-
nizations across many different sectors.
Last question. What’s your prediction
for the future of predictive analytics?
Two main things are going to continue
to happen. Predictive analytics is going to
penetrate further and become more pervasive.
Even within the well-trodden application
areas – marketing, credit risk,
fraud detection – there is so much more
that can be done and it is constantly ex-
panding. That’s more of a quantitative
difference.
The place where there is a qualitative
difference is where organizations predict
something new that you might have not
thought of as a value proposition. For ex-
ample, Google uses a predictive model to
help inform the ranking of search results.
That might be a no-brainer since that’s the
whole point of Google’s value to the user.
However, Google’s revenue comes from
ads. They also separately predict, on be-
half of their advertisers, which new ad that
hasn’t been tested yet is most likely to
have perceived low quality and get a high
bounce rate. So there are all sorts of behaviors
that can be predicted and new value
propositions that can be derived there.
The second prong is that predictive analytics
is going to become increasingly
apparent and visible to the end user. The
end user will perceive value in being predicted
and see that, as a consumer, it is actually
helping you.
There are places where that exists
now. Netflix predicts which movies you’re
going to like, Amazon on books and Pandora
on music. In addition to product
recommendations, spam filters have
been greatly improved so the spam prob-
lem has been largely solved. The junk
mail problem is not going to be entirely
stopped because it’s a numbers game
the marketers are playing, but it will be
alleviated because it’s a win-win when
marketers can predict those people who
are unlikely to respond and say, let’s not
send them what they will perceive as
junk mail.
You are starting to see consumers
and citizens become more and more
aware of the power of predictive analyt-
ics. I included in the book as an afterword
10 predictions for the first hour of 2020
– you’re driving to work and 10 different
things happen, and they are all assisted
by predictive analytics. The main thing is
connecting the technology – connecting
your smart phones to your car, for exam-
ple. All sorts of things are happening now
that just need to be integrated in order to
help you in your daily life. ❙
Peter Horner is
the editor of Analytics magazine.
Information systems and technology are messy, which
makes applying analytics an arduous task.
Messy analytics?
It’s OK. We’re
housewives!
BY GARY COKINS
TECHNICAL & MANAGERIAL PROBLEMS
Who is working with perfect
systems? That is like asking
if your house or apartment is
always clean and tidy to be
presentable to guests. In your organization,
who cleans up messes? Who fixes things?
I am writing on this topic of messes
based on a cafeteria-style breakfast I had
in a hotel in Estonia where I was present-
ing a seminar. It was the peak morning
rush, and all the tables were filled with
diners. A couple that was seated next to
me had just departed. Two women who I’d
estimate were in their late 40s eyeballed
the two vacant chairs. One made a silent
inquiry, “Are these seats available?”
I promptly answered, “Yes, but it’s a
mess” and pointed to the soiled dishes, cof-
fee cups and silverware. One of the ladies
immediately replied, “It’s OK. We’re house-
wives.” They then picked up the soiled
dishes and carried them to a nearby cart.
After she replied, I had to laugh out
loud. Her reply communicated so many
things to me. It meant, “We are accus-
tomed to cleaning up messes left behind
by others.” It also meant, “Someone has
to do it,” especially when it is urgent. It
also meant, “If you have been cleaning
up most of your life (and I suggest the
reader replace “life” with “career”), then
you are resigned to the reality that there
will always be a mess or problem that will
need to be addressed, cleaned or fixed.”
WHAT TYPE OF MESS ARE YOU
CLEANING?
In information systems and technol-
ogy (IS/IT) the messes are on three lev-
els: technical, analytical and managerial.
I will describe them soon, but first let’s
note that there are two types of IS/IT
cleaning and maintenance staff.
1. Hardware and infrastructure professional
“cleaners.” Candidly, I am not
knowledgeable enough to understand
what they do and what their jobs entail. I
suspect it is complicated. To me it is like
they are down in the boiler room of an
ocean sailing ship. To oversimplify their
job, it is to keep the lights on for the users.
2. Analyst “cleaners.” Their job is to
convert data into information with the
purpose to provide insight and foresight
for better decisions and actions.
Both types are important. They serve
different purposes.
When you read my classification of
“mess” levels that follow, which of the
three levels do you believe is the most
damaging to an organization’s perfor-
mance and success? Here they are:
• The technical mess. Technical
messes involve disparate data
sources, a patchwork of purchased
hardware and software with
predictable incompatibility problems
and computing/storage capacity
issues. Down in the IS/IT “boiler
room” there are lots of hammers and
wrenches to fix things.
• The analytical mess. In contrast
to the technical mess, which is
typically complicated to deal with, the
analytical mess is complex. (To learn
the difference between complicated
and complex in this context, see
[1].) In summary, complicated
systems have many moving parts,
like a wristwatch with gears; but
they operate in patterned ways. In
contrast, complex systems have
patterned ways, but the interactions
– think variables – are continually
changing. In the former, one can
usually predict outcomes. The math
may be easy with linear relationships.
In the latter, such as air traffic control,
weather and aircraft maintenance
delays cause changes in the constant
interactions with numerous variables.
Analyst messes are partly caused
by technical messes. An example is
defective and/or incomplete source
data – the dirty data mess. But even
if the analyst has the luxury of perfect
input data quality (like a germ-free
hospital surgery room or semiconductor
chip clean room environment), they
will still encounter a problem framing
the mess. A key aspect of an analyst’s
job is to correctly frame the problem
they are trying to solve. Solutions to
problems do not always require nails,
so an analyst needs more in their tool
belt than a hammer.
The good news for analysts is
that high-performance analytics
and visualization software is now
leveraging massively high-speed
computing and storage technology.
The combination is like a powerful
cleanser. With this advance in
technology, a test of a carefully framed
hypothesis about a set of variables,
which used to require hours of computer
time, can today be run in seconds.
This means that analysts can more
quickly test and learn. Better yet, they
can quickly test, fail and learn. Failing
to validate a hypothesis no longer has
the adverse consequences it once had
in terms of causing delays. Analysts
can now fail so quickly that no one
sees the brief mess they made.
• The managerial mess. Senior
management can create a mess that
is more difficult to clean than
a technical or analytical mess. This is
because the mess they create comes
from their minds and attitudes.
The managerial mess has
two broad categories: power and
incompetence.
The mess caused by power
typically involves such a high
reliance on intuition, gut feel and
past experience that managers
believe they can get by without
fact-based information and deep
analysis. Confirmation bias further
muddies the floor when executives
nudge the analysts to twist the
findings to satisfy the executives’
pre-conceived notion of what is the
correct answer.
Messes caused by incompetence
typically involve the inability to see
or understand what the analyst’s
findings imply for decisions
and actions. For example,
when a marketing analyst for a
telecommunications company
examines more than a thousand
variables to determine what types
of optimal deals might best be
offered to maximize profits from
differentiated types of customers, the
findings may be counter-intuitive –
yet valid. That is because analytics
deals with complexity, not just
complications.
Both managerial mess categories
can be addressed, but it requires
fortitude by employees to manage
their managers; that is, to convince
managers of the follies of their
management style.
CLEANING SOLUTIONS FOR MESSES
Technical messes can be cleaned
with good, advanced capital investment
planning for purchasing (or now with
SaaS, renting) the correct hardware and
software. Also educate, educate, edu-
cate; and train, train, train.
Analytical messes can be cleaned
with good data governance practices
(e.g., extract, transform and load) and
with high-performing analytics software
with visualization to accelerate the ana-
lyst’s experimentation and investigation.
Managerial messes are more prob-
lematic. The stains in the carpet are deep.
After the long, cold winter, we all at-
tempt to do spring cleaning. I submit that
following the global economic collapse
of 2008 many organizations are on the
road to recovery. They experienced their
winter of the economic cycle. It is time to
clean up messes.
We need to behave like the two
breakfast diners I met who said, “It’s OK.
We’re housewives.” Get on with the task.
Get your IT house in order. The popular-
ity of analytics, operations research tech-
niques and big data is quickly growing.
Having messes will slow down your prog-
ress to leverage and deploy analytics. ❙
Gary Cokins is
president of Analytics-Based Performance
Management LLC in Cary, N.C. He is a member of
INFORMS.
Candid comments underscore successful launch of
Certified Analytics Professional (CAP®) program.
How will analytics
certification impact
your career?
COMPILED BY GARY BENNETT AND PETER HORNER
CAREER ADVANCEMENT
When the Institute for Operations Research and the
Management Sciences (INFORMS), the publisher of
Analytics magazine, set out to establish
the first-of-its-kind analytics certification
program, the organization had two constituents
in mind: analytics professionals and
the organizations that hire them. Along
with helping analytics professionals boost
their careers and elevate themselves in
the marketplace with the Certified Analytics
Professional (CAP®) designation, the
program was also designed to help hiring
managers identify qualified analytical talent.
Less than six months after its official
launch, the CAP® program is apparently
succeeding on both fronts.
To ascertain the initial response and
potential impact of the CAP® program, INFORMS
Marketing Director Gary Bennett
and Analytics Editor Peter Horner devised
a simple, three-question e-mail and sent it
out to a random sample of the 49 individuals
who had earned the CAP® designation
as of Aug. 1. The survey was not scientific;
the authors just wanted to take the pulse
of the newly created CAP® community and
garner feedback. The three questions:
1. Why did you pursue certification?
2. How did you prepare for the exam?
3. How do you expect certification to
impact your career going forward?
The insightful feedback follows:
DON BUCKSHAW, CAP®
Chief Operations Research Analyst
SAIC
Why did you pursue certification? The
certification was a nice way to finish a
three-year effort to reinvent myself as a
data scientist instead of an operations
research analyst and demonstrate that I
had achieved my goal.
How did you prepare for the exam?
The exam has three subject areas. The
first had an operations research slant with
topics such as optimization and discrete-
event simulation. The second had a data
analytics slant with topics such as data
mining and predictive models. The third
involved lifecycle project management,
including problem framing and model sus-
tainment. I was comfortable with the oper-
ations research material after working 15
years for an operations research support
organization. I was also in the last semes-
ter of Northwestern University’s Predictive
Analytics Master’s program, so I was well
prepared for data analytics. Helping to
teach an INFORMS Soft Skills Workshop
in the past prepared me for the project
management portion. But I still reviewed
the study guide and studied rusty areas.
How do you expect certification to
impact your career going forward?
Taking the test was a personal goal that
provided tangible evidence to prove I had
successfully transformed myself. It is too
early to tell if the certification will become
important to the community or employers,
but it is a nice differentiator. The CAP®
provides an instant level of credibility with
customers who want to start a new analyt-
ics initiative and need guidance.
Eleven more earn CAP® designation
INFORMS congratulates the following
individuals who comprise the latest
group to earn the Certified Analytics
Professional (CAP®) designation,
bringing the total to 49 as of Aug. 1 since
the first exam was offered in April:
Mark Colosimo (Detroit)
Philip Fry (Chicago)
Laurie Garrow (Atlanta)
Mario Rappi (Wichita, Kan.)
Herbert Hackney (Vienna, Va.)
Eugene Hahn (Salisbury, Md.)
David Johnson (Alexandria, Va.)
Timothy Lortz (Laurel, Md.)
Harrison Schramm (Arlington, Va.)
Jose Tejeda (Laurel, Md.)
Chen-hang Yen (Fairfax, Va.)
THOMAS W. CHESNUTT, CAP®
CEO
A&N Technical Services, Inc.
Why did you pursue certification? Certification
is an important broadening of the
understanding of analytics and its associ-
ated skill set. Understanding how to better
communicate dispersion and improve un-
derstanding of risk are necessary ingredi-
ents for improving the transparency and
robustness of decision-making.
How did you prepare for the exam? I
ordered copies of most of the textbooks
mentioned, took the sample test (list-
ed areas of least comfort), studied the
books (with more time in less familiar ar-
eas), listed unfamiliar jargon for follow-
up and revisited old texts. Pursuit of the
unknown opened up new potential appli-
cations for my practice area in improving
efficiency in the water industry.
How do you expect certification to impact
your career going forward? Besides
improving professional relationships
with other peers in analytics, I have had
clients inquire as to what those funny let-
ters mean. It has opened pathways to new
conversations that I might not have had
in the past. Quite frankly, however, I went
through the CAP® certification primarily for
intrinsic and not extrinsic reasons.
BETH NIELSEN, CAP®
Operational Analytics Manager
Quintiles
Why did you pursue certification? I
love the idea of a certified analytics professional;
that an independent group has
verified that I have the education and
experience to solve analytical business
questions, and also that I have the soft
skills needed to explain what I did!
How did you prepare for the exam? I
looked over the sample questions to make
sure that I understood all of the concepts
and could easily answer the questions. For
a few of the questions I researched the ter-
minology. I knew the concepts, but wanted
to make sure that I didn’t waste time or
misunderstand a question due to wording.
While taking the actual exam, I asked my-
self: If this situation came up at work, what
would I do? I felt confident with my answer
when framing the question in this way.
How do you expect certification to impact
your career going forward? I have
hopes that the certification will positively impact
my career since it is an independent
assessment of knowledge and understand-
ing of business analytics. It can be used
as a marketing tool with our customers to
show that we take analytics seriously. Since
analytical skills are required throughout all
industries, it’s important to have a gold standard
for the business analytics field.
K. MATTHEW WINDHAM, CAP®
Director of Analytics
NTELX Inc.
Why did you pursue certification? I
chose to pursue the CAP® in order to enhance
my stature in the analytics community.
All prior certifications in the field
have been vendor-centric, and thus tied
to specific software skills rather than deep
analytics knowledge. Therefore, this level
of independent, academically rigorous
certification was desperately needed in
the industry. So, when I learned of the INFORMS
effort to create the certification, I
couldn’t wait to sit for the exam and demonstrate
my expertise in analytics.
How did you prepare for the exam?
As a professional in the field, I find myself
constantly reading to stay abreast of
this exciting and ever-evolving industry.
So, preparing for the CAP® was really no
different, except for the flavor of material
selected. The sample exam provided online
included references for each segment
of material covered. I reviewed each of
the ones I already owned, and purchased
and read those I didn’t have. Tests such
as the CAP® are difficult not because of
the depth of knowledge in one area being
tested, but instead due to the breadth of
material being covered. Staying current in
analytics means constantly learning
something new, as well as keeping past
knowledge fresh, which is exactly what
the CAP® exam forces you to do.
How do you expect certification to impact your career going forward?
The CAP® already denotes a level of expertise in analytics that allows
me to stand out among my peers. As the credential gains awareness, I
expect this will continue and make my services more in-demand than
ever before. I also anticipate this will become the gold standard by
which analytics practitioners are measured, thus raising the bar
across the profession.
STEVEN HARROD, CAP®
Assistant Professor
University of Dayton
Why did you pursue certification? I pursued CAP® certification to
expand my career options, but it was also a great boost to my
self-esteem. I perform a small amount of consulting, and I may wish to
expand that in the future. Certification is not a priority for
promotion for a research academic, but I argue we should “practice
what we preach.” Certification lends authority to my classroom
lessons, and confidence that I am leading my students in the correct
direction.
How did you prepare for the exam? I prepared earnestly for the exam,
starting about three months in advance, but mostly devoting about 60
hours of study in the last few weeks. I worked the practice questions
and interpreted a wrong answer as a flag to relearn whole topics. I
read many of the suggested reference texts. Many of the exam questions
concerned client relations and project management, and my prior
consulting experience really informed my answers.
How do you expect certification to impact your career going forward?
Certification will be a great marketing tool if I ever decide to
increase my consulting work, and it certainly makes me feel good about
myself. In addition, it demonstrates my versatility beyond pure
research. Certainly, it allows me to tell my students, “I have been
there.”
RAMI MUSA, CAP®
Operations Management Consultant
Dupont
Why did you pursue certification? I have been working in the field of
operations research analytics for years and having the stamp of
INFORMS is a great validation of my knowledge and expertise. It will
also help me establish a connection with well-established
professionals in the field from different backgrounds and industries.
How did you prepare for the exam? I read the handbook completely
(small booklet) and reviewed it with a colleague of mine. We exchanged
opinions and thoughts about what we knew and what we did not. I also
read some references (books and online sources) for some concepts
brought up in the handbook that I was either unfamiliar with or had
only a little background in.
How do you expect certification to impact your career going forward?
It will help me be a part of a network of professionals in the field
and to be a leader in the field of analytics.
ALAN L. AUSTIN, CAP®
Director of Business Development
Breakthrough Consulting Group
Why did you pursue certification? For me the path to certification was
perhaps a little different from some of my fellow CAP®s. I didn’t have
an employer expecting me to achieve additional credentials.
Certification as an analytics professional wasn’t the next logical box
for me to check on my work skills portfolio. In fact, I wasn’t even
serving in a role where a particular need for analytics certification
existed. I just knew that once INFORMS announced the initiative, I had
to try.
How did you prepare for the exam? My
exam preparation included carefully re-
viewing the “Analytics Job Task Analysis”
information on the INFORMS website.
The information about the seven do-
mains made it fairly straightforward for
me to identify those areas where I might
need to brush up my skills. Then it simply
became a matter of blocking time into my
schedule to cover the target study areas
prior to examination.
How do you expect certification to impact your career going forward?
I am proud to be a Certified Analytics Professional, regardless of how
great or small an impact certification has for me in terms of my
career. For me, achieving certification as an analytics professional
is more about remembering who I am and how important it is to me to
try to make sense of the data storm in which we live. Those of us who
thrive on the detective work that is business analytics do have an
opportunity to make major contributions to the organizations where we
work and the society in which we live. But in my opinion the greatest
benefit to being analytics-oriented is that it makes the world a
fascinating place.
MARK COLOSIMO, CAP®
Global Director, Integrated Analytics
for Urban Science
Why did you pursue certification?
Having analytics credentials will become
more and more important as the use of
data to make business decisions esca-
lates. I also wanted to examine its value
in representing analytical knowledge and
experience for future hiring and internal
training.
How did you prepare for the exam? Actually, I just reviewed the
practice questions that were provided in the certification program
literature. Having a demanding job and family, while working on a
Ph.D., I had little time to do more. A few literature samples were
stated in the handbook, but I did not have the needed time to review
them. When I first reviewed the exam outline, I quickly realized that
the content did represent what is done on a regular basis in practice.
Nonetheless, I think the exam exemplifies what real-world analytics
professionals experience and, perhaps, little “prep” is really
required for those that have substantial experience in the areas
represented. Still, for those with less or more specific experience, a
test preparation course or additional sample questions might be
helpful.
How do you expect certification to impact your career going forward?
I am not sure. The value of the certification lies with the continued
success of INFORMS in its representation of analytics leadership
through education, application and analytics utilization in various
disciplines. For me, the certification is something that I will ask my
fellow analytics professionals to obtain as a symbol of their
experience and knowledge in analytics.
For more on the CAP® program and upcoming exam dates and sites
including Boston (Oct. 3), Minneapolis (Oct. 5), San Francisco
(Nov. 6), New York (Nov. 12), Toronto (Dec. 12) and Tuscaloosa, Ala.
(Jan. 29, 2014), click here. ❙
Gary Bennett is the director of marketing for INFORMS. Peter Horner
is the editor of Analytics and OR/MS Today (the membership
magazine of INFORMS).
TEXT ANALYTICS

Lost in Translation II
BY CHRISTOPHER BROXE AND FIONA MCNEILL

Part two of a two-part series on best practices for
analyzing multi-lingual text.

In the first part of this series [1], we examined how to retrieve
multi-lingual text without prior knowledge of the language while
keeping the native script intact in order to preserve meaning. Once
inputs are acquired from social media, external Web pages and even
internal file system documents (like e-mail, transaction scripts and
archives), the objective is to understand what the text refers to and
its associated significance to the analysis and business objective.

While text analysis models are defined in the native language to
capture the intended meaning of the author, the application of
multi-lingual text models to categorize content or identify sentiment
can be completed by a non-native speaker. Although translation
technology can be used to frame text in a single language, it is at
best a secondary strategy to native language analysis. Beyond issues
of idioms, sentence-level phrases vary between languages, as
illustrated in part one of this series: “Snälla, kör mig till ett
roligt ställe” said to a cab driver will take you to a fun place in
Sweden or a cemetery in Denmark.

The good news is that integrated text analytics software can be used
to automatically identify the native language and to apply the correct
encoding methods to statistically mine multi-lingual text inputs.
Furthermore, the nomenclature of natural language processing (NLP)
rule definition for detailed sentiment and advanced linguistic
analysis shouldn’t appreciably change by language. Technology exists
to similarly define text analysis models and rules, although detailed
discussion of how that is done is beyond the scope of this article.
What we do consider in this second installment is how to apply
multi-lingual text models and distill meaning from the contents in a
digestible manner, illustrated with examples using SAS software.
APPLYING MULTI-LINGUAL TEXT MODELS
Document processors are applied to the defined, crawled documents that
now exist in separate files resulting from the native language
information retrieval described in part one of this series. A pipeline
of activity is defined in the system to iterate through the steps
needed to process the inputs, beginning with a pre-processing filter
step that automatically assigns the files to the associated language
[2]. An example of steps that might be used after language filtering
to analyze the document contents is illustrated in Figure 1.
Figure 1: Text analytics pipeline of activity (in English) with
“language=ENGLISH” predefined to isolate English documents for further
analysis.

At this point we can be certain that the document being analyzed will
be in English, given the materials have been filtered on
“language=ENGLISH.” Text analytics processing steps defined in the
document processor add metadata to the corresponding documents based
on the analysis of the document contents themselves. The first text
processing activity is to extract the date – a common definition can
be used, regardless of language. Once extracted, a filter is applied
to verify that the body field of the defined document is not empty and
that text is actually present to be examined in successive steps.
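For readers who think in code, the first steps described above (language filter, date extraction, non-empty body check) can be sketched in a few lines of Python. This is a minimal illustration, not SAS code: the document fields (`language`, `body`), the function names and the single ISO-style date pattern are all assumptions made for the example.

```python
import re
from datetime import date

# Hypothetical date pattern (ISO style, e.g. 2013-09-26); a real pipeline
# would recognize many more formats.
DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def language_filter(docs, language):
    """Keep only documents whose 'language' field matches."""
    return [d for d in docs if d.get("language") == language]


def extract_date(doc):
    """Return a copy of the document with a 'date' metadata field added."""
    enriched = dict(doc)
    enriched["date"] = None
    match = DATE_RE.search(doc.get("body", ""))
    if match:
        year, month, day = (int(g) for g in match.groups())
        try:
            enriched["date"] = date(year, month, day)
        except ValueError:
            pass  # digits matched the pattern but are not a real calendar date
    return enriched


def has_body(doc):
    """True when the body field contains text to analyze."""
    return bool(doc.get("body", "").strip())


def run_pipeline(docs, language="ENGLISH"):
    """Language filter, then date extraction, then the non-empty-body check."""
    selected = language_filter(docs, language)
    dated = [extract_date(d) for d in selected]
    return [d for d in dated if has_body(d)]
```

Each step returns new data rather than modifying its input, so the same functions could be reused in a second pipeline with a different language value.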
CATEGORIZE AND EXTRACT DESIRED ELEMENTS
The “content categorization” post processor [3] will categorize the
document into one or many relevant groups depending on the extent to
which the desired elements are present in the document. These are
based on specifications created using linguistic rules (developed
using automatic statistical methods), automatic linguistic methods,
user-specified NLP rules or predefined rules (a.k.a. taxonomies), or
any combination of these methods [4]. As an example, a document from
the BBC discussing the Syrian conflict (in English) can be
automatically categorized into “social unrest,” “wars and conflicts”
and so on. At this stage of text processing we can also extract
concepts and facts, such as the names of people, locations and
organizations based on predefined linguistic rule specification. A
variety of methods exist to extract concepts, everything from simple
lookups and classifiers (i.e., text strings) to more advanced
contextual extraction, where parts-of-speech, operators and
conditional definitions can be used to identify desired elements in
the documents. Specifications for extraction within paragraphs or
sentences can be mixed with desired nouns, verbs, prepositions and
other parts of speech – denoted with separators, distances between
terms, negation and other building blocks that define the extraction
rules used to pinpoint desired aspects of the text.

Figure 2: Multiple languages can be specified to a single taxonomy,
with English categories on the left associated with linguistic rules
defined in Arabic.
As an example, suppose we have the
text:
“Bill Clinton made a visit to the newly renovated terminal at John F.
Kennedy International Airport in New York. Mr. Clinton made a comment,
“It’s the best looking terminal I’ve seen!” and even though airport
officials told reporters that they unfortunately were $2 million over
budget, the project was a success. It was the first visit of the
president in N.Y. City this year. Another guest at the press
conference was the director of SAS, who told the media that they were
starting up scheduled flights from Gothenburg, Sweden, to JFK Airport
in June. After leaving the airport, President Clinton went off to
another meeting at the Free Library of Philadelphia.”
A computer may have difficulty deciphering these sentences. A simple
text analysis engine might return the following results:
• Person: Five different people are mentioned, namely Bill Clinton,
John F. Kennedy, JFK, Mr. Clinton and President Clinton.
• City: Three cities are described as New York, Philadelphia and
Gothenburg.
• Organization/Company: One organization is mentioned, SAS, but is it
SAS Inc. (the software company), SAS Airlines or SAS Special Air
Service of England?

Figure 3: Using the content categorization post-processor, derived
categories become facets of the document collection that can be
searched and explored by end-users.

Figure 4: Structured native language text data, with corresponding
derived fields exported from a post processor to a database, viewed in
SAS’ Enterprise Guide.
A more discriminating, and indeed better, analysis would be able to
extract the following from the same paragraph:
• Person: Bill Clinton, with identified co-references to “Mr.
Clinton,” “the President” and “President Clinton.”
• City: New York (co-referenced as “N.Y. City”), Gothenburg.
• Place: John F. Kennedy International Airport (co-referenced as JFK
Airport) and Free Library of Philadelphia.
• Organization/Company: SAS (co-referenced as Scandinavian Airlines).
• Sentiment [6]:
  Overall paragraph sentiment is positive
  Terminal: positive – “It’s the best looking terminal I’ve seen!”
  Investment: negative – “they unfortunately were $2 million over
  budget.”
  Project: positive – “the project was a success.”
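Much of the gap between the two analyses above comes from two habits: matching the longest name first (so the airport is not mistaken for a person) and mapping every surface form to one canonical entity. The toy Python sketch below illustrates only that idea. It relies on a hand-built alias table, and `ALIASES` and `extract_entities` are hypothetical names for this example, not part of SAS or any other product.

```python
# Hypothetical alias table: surface form -> (entity type, canonical name).
ALIASES = {
    "John F. Kennedy International Airport":
        ("Place", "John F. Kennedy International Airport"),
    "JFK Airport": ("Place", "John F. Kennedy International Airport"),
    "Free Library of Philadelphia": ("Place", "Free Library of Philadelphia"),
    "John F. Kennedy": ("Person", "John F. Kennedy"),
    "Bill Clinton": ("Person", "Bill Clinton"),
    "Mr. Clinton": ("Person", "Bill Clinton"),
    "President Clinton": ("Person", "Bill Clinton"),
    "New York": ("City", "New York"),
    "N.Y. City": ("City", "New York"),
    "Philadelphia": ("City", "Philadelphia"),
    "Gothenburg": ("City", "Gothenburg"),
}


def extract_entities(text):
    """Group mentions by entity type, resolving aliases to canonical names.

    Longer aliases are matched first and then blanked out, so a short name
    that is only part of a longer one is not counted a second time.
    """
    found = {}
    remaining = text
    for alias in sorted(ALIASES, key=len, reverse=True):
        if alias in remaining:
            kind, canonical = ALIASES[alias]
            found.setdefault(kind, set()).add(canonical)
            remaining = remaining.replace(alias, " " * len(alias))
    return found
```

A plain lookup without the longest-first rule would report John F. Kennedy as a person and Philadelphia as a city, which is the behavior of the simple engine described earlier. Generic references such as “the president” need context and are outside the reach of a lookup table like this one.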
After the content categorization post processor has classified and
extracted desired aspects of the text, the “sentiment analysis” post
processor is executed. Similar to content categorization, predefined
taxonomies and NLP rules are applied; however, in this step the
objective is to identify and assess the sentiment expressed in the
text. Not only is overall document sentiment defined but so is the
sentiment associated with any desired aspect [6], such as “homeland
security,” “economics,” “perception” and so forth. Once the sentiment
is defined, this particular activity stream ends with exporting
results.
The generated metadata along with the corresponding documents are
exported, first to XML and then to a database. This same process is
defined and executed for every unique language, applying the specific
categorization, extraction and sentiment text models with the
associated language filter, which associates the correct
language-specific model with the corresponding language document.

Metadata, on the other hand, created as a product of this processing,
is language independent. Once the language-specific rules are defined,
the documents are scored to the taxonomy specification. As such, when
a category match occurs and the associated metadata is generated, it
is in a common language. As a result, text across multiple languages
can be examined in totality. So even if a document matches the “gang
activity” category, the document itself can be in Arabic, French,
English or any other native language.

Text data is now structured and can be examined in a variety of ways.
With the results exported to XML, the files are indexed and can be
readily searched and retrieved based on the facet categories created
in the post processing, as illustrated in Figure 4. These categories
index the document collection and can be interactively explored with
the document contents residing in its native language format. In
Figure 4 we see this for Arabic documents, and the related categories
through which the user can browse are derived from the Arabic taxonomy
defined to the content categorization processor. By clicking on an
item in that hierarchy, say “Election,” the documents are filtered to
only list texts that discuss that issue.

And with the database export, the structured text documents can be
explored and described with other tools designed to analyze data
defined in rows and columns, as illustrated in Figure 5.

Reports and visualizations summarizing total population discussions,
materials and attitudes can be created from this structured text, and
include all language documents given the language independence of the
derived metadata, as illustrated in Figure 5 and Figure 6. Filters and
drill-paths enable interactive examination to detail language-specific
results.

Figure 5: Drillable report listing the top five people, locations,
organizations and topics discussed in both Arabic and English from a
variety of sources.

Figure 6: Associations across derived categories can be visualized for
all language documents.

CONCLUSION
Advanced crawling, parsing, language recognition and text analysis all
contribute to understanding unstructured data. Clarifying the meaning
comes from analyzing what the author intended, described in the native
language. Multi-lingual information from social media, Web pages and
even existing document collections can be accessed without the need to
comprehend various dialects. Application of text analysis models can
also be done without any language requirements. The text models
designed to identify and decipher meaning using linguistic analysis
can be created in a common environment, with different models
developed for each unique language. The results, enhanced with
language-independent metadata that has structured the text, can
readily be explored, summarized and visualized, enabling decisions
based on all the information, and not just what a translator has
obfuscated. ❙

Christopher Broxe is a business solutions manager and Fiona McNeill is
a global marketing manager at SAS.

REFERENCES
1. Referring to http://viewer.zmags.com/publication/3a28b0ac#/3a28b0ac/56,
May/June 2013 edition, published by Analytics-Magazine.org.
2. SAS linguistic technologies, including content categorization,
support 30 different native languages, in addition to individual
dialects.
3. A “language_identification” processor was described in part one of
this series, which outputs a field called “language” enabling a filter
to be defined, say “language=ARABIC.” By creating a language filter,
documents are streamlined to processing activity that has been defined
for that specific language (such as only Arabic documents being
assessed using the Arabic text analytics pipeline processors).
4. This is a post processor, as it is run after the documents have
been identified by the crawler.
5. SAS provides methods for all of these different types of linguistic
specifications, with automatic methods often being used to initiate
linguistic specification, which can be further refined by the
end-user.
6. Sentiment can be derived for detailed features in text provided the
topics are defined as part of a multi-level hierarchy, as these
results imply.
CONFERENCE PREVIEW

Winter Simulation Conference 2013
BY RAYMOND R. HILL

Premier international forum for disseminating recent advances in the
field of system simulation set for Washington, D.C.

The Winter Simulation Conference (WSC) has been the premier
international forum for disseminating recent advances in the field of
system simulation for more than 40 years, with the principal focus
being discrete-event simulation and combined discrete-continuous
simulation. In addition to a technical program of unsurpassed scope
and high quality, WSC provides the central meeting place for
simulation researchers, practitioners and vendors working in all
disciplines and in industrial, governmental, military, service and
academic sectors. WSC 2013 will be held Dec. 8-11 at the JW Marriott
Hotel in the heart of Washington, D.C.

The appeal of simulation is its relevance to a diverse range of
interests. WSC has always reflected this diversity and WSC 2013 aligns
with and expands upon this tradition. For those more inclined to the
academic aspects of simulation, we offer tracks in modeling
methodology, analysis methodology, simulation-based optimization,
simulation for decision-making and agent-based simulation. For those
more inclined to the application of simulation, tracks include health
care, supply chain management, military applications, project
management and construction, homeland security and emergency response,
environmental and sustainability applications and applications in
social science and organization.
The Modeling and Analysis
of Semiconductor Manufacturing
(MASM) is a conference-within-a-
conference featuring a series of ses-
sions focused on the semiconductor
field. Back for WSC 2013 is the In-
dustrial Case Studies track affording
industrial practitioners the opportunity
to present their best practices to the
simulation community. Finally, WSC
provides a comprehensive suite of
introductory and advanced tutorials
presented by prominent individuals in
the field along with a lively poster ses-
sion, Ph.D. colloquium, a newcom-
ers’ orientation and a distinguished
speaker lunchtime program.
The theme for WSC 2013, “Simulation: Making Decisions in a Complex
World,” is timely and relevant. The conference keynote speaker is Dr.
Eric Bonabeau, founder, CEO and chief scientific officer of
ICOSYSTEM, and one of the world’s leading experts in complex systems
and the application of complexity science to real-world problems. The
military keynote speaker is Jeff Cares, a naval captain, the founder
and president of Alidade Incorporated, and one of the top thought
leaders in “Information Age” innovation.
The WSC is designed for profession-
als at all levels of experience across broad
ranges of interest. The extensive cadre
of exhibitors, the meetings of various
professional societies and user groups,
along with the various social gatherings,
give all attendees the opportunity to get
acquainted with each other and to be-
come involved in the ever-expanding
activities of the international simulation
community.
For further information, click here. ❙
Raymond R. Hill is the general chair of WSC
2013.
FIVE-MINUTE ANALYST

Modeling zombies
BY HARRISON SCHRAMM, CAP®

On general principle, I do my best to avoid zombies. However, the
increasing number of zombie games, movies, and even an academic paper
[1] have convinced me that zombies are worth five minutes of effort.

Zombies are presumably difficult to combat for two reasons: First,
they are already dead, so killing them takes a little more “umph.”
Second, if you are in a group of people fighting zombies some of your
unfortunate comrades will become dead-dead, but others may become
un-dead, making more zombies to fight. Zombies can also be difficult
to model. First, there are no historical zombie scenarios (and
hopefully there never will be [2]). Second, everyone has his or her
own opinions about what zombies would do in a fight. These differences
don’t mean that we can’t think about it analytically.

Our analysis proceeds as follows: Consider a group of zombies, who at
any time may infect a human, kill a human or spontaneously die – most
zombies don’t look too healthy and we assume that they die of
(un)natural causes. Zombies may use weapons to kill humans, so they
don’t necessarily need to be in direct contact with them [3], but they
do require contact to infect [4]. Similarly, humans may be infected or
killed by zombies. We assume that humans die from non-zombie causes at
a rate that is negligible for the timeframes we consider.

Let Z be the number of zombies and B be the number of humans. Let β be
the rate at which zombies kill humans (make dead-dead). Let χ be the
rate at which zombies convert humans to zombies (make un-dead) and let
ρ be the rate at which humans kill zombies. Let ξ be the rate at which
zombies die due to being, well, zombies.

This yields:

$$\frac{dB}{dt} = -\beta Z - \chi B Z$$

$$\frac{dZ}{dt} = \chi B Z - \rho B - \xi Z,$$

as a deterministic model for fighting zombies.
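For readers who want to explore the model numerically, the two equations can be stepped forward with a simple forward-Euler scheme. The sketch below is only an illustration under stated assumptions: the function name `simulate`, the step size and the stopping rule are choices made here, χ is supplied directly as a raw rate, and nothing in it is claimed to reproduce the figures in this column.

```python
def simulate(b0, z0, beta, chi, rho, xi, t_end=20.0, dt=0.01):
    """Forward-Euler integration of
        dB/dt = -beta*Z - chi*B*Z
        dZ/dt =  chi*B*Z - rho*B - xi*Z
    Returns a list of (t, B, Z) tuples. Populations are clamped at zero,
    and the run stops early once either side is wiped out (an assumption
    of this sketch, not part of the equations).
    """
    b, z, t = float(b0), float(z0), 0.0
    history = [(t, b, z)]
    while t < t_end and b > 0 and z > 0:
        db = -beta * z - chi * b * z          # humans killed or converted
        dz = chi * b * z - rho * b - xi * z   # zombies recruited, killed or expired
        b = max(b + dt * db, 0.0)
        z = max(z + dt * dz, 0.0)
        t += dt
        history.append((t, b, z))
    return history
```

Calling `simulate(200, 100, beta=0.12, chi=0.001, rho=0.05, xi=0.055)` returns the two trajectories, which can then be plotted or swept over a range of `chi` values for a rough sensitivity study.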
Analyzing a model like this is more
about gaining insights than any particu-
lar solution. In most analytic settings, a
detailed discussion of numerical results
or integration techniques may lead your
clients to spontaneously become zom-
bies. Analysis is not about the “eaches”
of mathematics, but it’s about insights
that decision-makers can use.
It is fairly obvious that if zombies do not naturally die off (ξ = 0),
the zombie population will be constant if zombies convert humans at
the same rate that humans kill zombies (χZ – ρ = 0), because the
zombie equation then reduces to B(χZ – ρ), and that the more zombies
there are, the easier it is for zombies to convert humans. The human
population is always decreasing so long as zombies are present.
Writing differential equation models and exploring them parametrically
can help tease out the insights from the problem. We note that the
zombie-conversion factor, χ, does not have the same scaling as the
other model parameters; it is in units of “zombies per zombies times
humans,” while all the other parameters are in terms such as “zombies
per human.” This kind of dimensional mismatch can cause grave errors.
In order to make the units work out, we create a new parameter,
$\chi_{rel} = \chi / .5\,(B_0 + Z_0)$, which is more natural to work
with for comparisons. A little exploratory analysis also shows that
the rate that zombies convert humans drives the result more than any
other parameter. Sensitivity analysis of $\chi_{rel}$ is shown in
Figure 2.

Figure 1: A typical instantiation of our zombie combat model. Note
that the two populations are equal around t = 10, but the rate of
zombie recruitment slows because it depends on the product BZ. For
this particular case, $B_0 = 200$, $Z_0 = 100$, ρ = .05,
$\chi_{rel} = .2$, β = .12, ξ = .055.

Figure 2: Sensitivity to the relative infection parameter
$\chi_{rel}$. In this scenario, humans are 10 times more effective at
killing zombies than zombies are at killing humans. Parametric
exploration shows that the critical infection value is approximately
.067, above which the zombies are victorious. Although each side
begins with 200 active, the zombie population may finish above its
initial number due to recruitment.
STOCHASTIC ZOMBIES [5]
The equations above don’t really tell the whole story if the number of
initial zombies is small (it usually is), and the initial zombie
begins in a far-away place (they frequently do). Now, we might assume
away the possibility that a human would kill zombie-Prometheus, since
nobody was expecting to meet a zombie on the street after a movie in a
city in broad daylight. Still, the first zombie needs to find a human
to infect before he dies of (un)natural causes. A full treatment of
this model will take more than five minutes; however, we can condition
on the first transition out of the state Z = 1 by using a continuous
time Markov chain (CTMC).
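One way to make that first transition concrete is to treat it as a race between two exponential clocks. The sketch below rests on assumptions that go beyond the text: no human kills the first zombie (as assumed above), the number of nearby humans B stays fixed until the transition happens, infection occurs at rate χB and (un)natural death at rate ξ. Under those assumptions the chance that infection comes first is χB / (χB + ξ). The function name is hypothetical.

```python
def spread_probability(b, chi, xi):
    """Probability that the first zombie infects a human before it expires.

    Assumes two competing exponential clocks out of the state Z = 1:
    infection at rate chi * b and (un)natural death at rate xi, with the
    number of nearby humans b held fixed until one of them fires.
    """
    infection_rate = chi * b
    total_rate = infection_rate + xi
    if total_rate <= 0:
        raise ValueError("at least one of the two rates must be positive")
    return infection_rate / total_rate
```

Evaluating the function for increasing values of `b` shows the probability rising toward 1, which matches the intuition that a proto-zombie in a crowd is more dangerous than one in a field.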
Figure 3: CTMC representation of the first transition away from the
initial zombie. We see that the more humans present, the greater the
possibility that the zombie infection will spread. Intuitively, the
more people B are in the immediate vicinity of the proto-zombie, the
greater the chance zombies will spread.

To conclude, while it seems that looking at zombies is a (hopefully
entertaining) diversion, there is a real point to this piece.
Proposing and analyzing simple deterministic models can be useful in
teasing out insights which may be broadly true. I have heard that
zombies are afraid of green engineer’s paper – this is why I always
carry some! ❙

Harrison Schramm (harrison.schramm@gmail.com) is an operations
research professional in the Washington, D.C., area. He is a member of
INFORMS and a Certified Analytics Professional (CAP®).
REFERENCES & NOTES
1. P. Munz, I. Hudea, J. Imad, R. Smith, 2009, “When zombies attack:
mathematical modeling of an outbreak of zombie infection,” in
“Infectious Disease Modeling Research Progress,” Nova Science
Publishers.
2. My mental references for this work are “Zombieland,” “Resident
Evil” and Poe’s “Masque of the Red Death.”
3. For those familiar, we use Lanchester aimed-fire as a model.
4. Hence the appeal to infectious disease models.
5. This is either an effect seen when I used to lecture on applied
probability or a fantastic name for a mathcore group.
THINKING ANALYTICALLY

Urban planning
BY JOHN TOCZEK

Urban planning requires careful placement and distribution of
commercial and residential lots. Too many commercial lots in one area
leave no room for residential shoppers. Conversely, too many
residential lots in one area leave no room for shops or restaurants.

The 5x5 grid in Figure 1 shows a sample configuration of residential
and commercial lots. Your job is to reorder the 12 residential green
lots and 13 commercial red lots to maximize the quality of the layout.

The quality of the layout is determined by a points system. Points are
awarded as follows:
Any column or row that has five residential lots = +5 points
Any column or row that has four residential lots = +4 points
Any column or row that has three residential lots = +3 points
Any column or row that has five commercial lots = -5 points
Any column or row that has four commercial lots = -4 points
Any column or row that has three commercial lots = -3 points

For example, the layout displayed in Figure 1 has a total of 9 points:
Points for each column, from left to right = -3, -5, +3, +4, +3
Points for each row, from top to bottom = +3, +3, +3, +3, -5

QUESTION: What is the maximum number of points you can achieve for the
layout?

Send your answer to [email protected] by Nov. 15. The winner, chosen
randomly from correct answers, will receive a Magic 8 Ball. Past
questions can be found at puzzlor.com. ❙

Figure 1: Reordering commercial and residential lots to maximize the
quality of the layout.

John Toczek is the director of Decision Support and Analytics for
ARAMARK Corporation in the Global Risk Management group. He earned a
bachelor’s of science degree in chemical engineering at Drexel
University (1996) and a master’s degree in operations research from
Virginia Commonwealth University (2005). He is a member of INFORMS.
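For readers who would rather check candidate layouts by machine than by hand, the points system above can be written as a short scoring function. This is a sketch under assumptions: layouts are given as five strings of 'R' (residential) and 'C' (commercial), the function names are hypothetical, and `EXAMPLE` is not the actual Figure 1 layout (which is not reproduced in the text) but one arrangement constructed to give the same column and row points quoted in the example.

```python
def line_score(line):
    """Points for one row or column of five lots ('R' or 'C')."""
    residential = line.count("R")
    commercial = line.count("C")
    if residential >= 3:
        return residential
    if commercial >= 3:
        return -commercial
    return 0


def layout_score(grid):
    """Total points for a layout: every row plus every column."""
    rows = [tuple(row) for row in grid]
    columns = list(zip(*rows))
    return sum(line_score(line) for line in rows + columns)


# Hypothetical layout with 12 residential and 13 commercial lots whose
# columns score -3, -5, +3, +4, +3 and whose rows score +3, +3, +3, +3, -5.
EXAMPLE = [
    "RCRRC",
    "RCCRR",
    "CCRRR",
    "CCRRR",
    "CCCCC",
]
```

With a scorer like this in hand, trying rearrangements becomes a matter of editing the five strings and calling `layout_score` again; finding the best layout is left to the reader.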
