Deep learning and shallow understanding
Martyn Richard Jones
Bruxelles, 21st May 2019
Hi, I'm Martyn Jones over at GoodStrat.Com (the Good Strategy Company), and today I want to talk about something I like to call deep learning and shallow understanding.
To begin at the beginning
In spite of, or perhaps because of, the many years I worked in artificial intelligence, I believe the current long-distance love affair with AI, and with what is euphemistically termed ‘deep learning’, to be somewhat irrational in its exuberance.
In the eighties I was experimenting with automatic
feature extraction, pattern recognition and parallel distributed processing
(see Rumelhart and McClelland), which led me to adaptive neural networks.
As a result, my research and development in this field showed me that automatically mining nuggets of gold, or automatically gaining real insight from data, is not easy, far from stable and frequently not achievable. These were interesting experiments, but frequently impractical in a business setting.
Back in the day I was interested in examples of possibilities in this ‘technology space’ and in trying to design, train and productise neural networks. At Unisys we started out trying to do things such as refining the prediction of whether a person would be a member of the Sharks or the Jets (New York gangs, if I remember rightly) and applying that approach to divining the credit risk of individuals and businesses. Elsewhere we were creating simple applications for handwritten number recognition and (a bit more sophisticated) basic document recognition.
To that end, I addressed the European IEEE conference on Neural Networks in Nice regarding our work in the USA and at the European Centre for AI and Advanced Information Technologies in Madrid, where I was working at the intersection of AI and advanced database technologies, with peers across the industry globally. At the same time, my company, Sperry Univac, was the go-to IT company for engineers, designers and computer scientists.
Sperry and Unisys gave me licence to research, design and build proof-of-concept prototypes in the areas of complex very-large database management, a tightly integrated 4GL and expert system development platform (extending a 4th generation language product called Mapper), heterogeneous cross-platform database query management (a project for the EU), and AI and data mining: the deep learning of today.
But central to this blog piece was my work in the area of AI and data mining. The goal I set myself was to find ways of extracting business-oriented rules from the data-mining process, and to be able to apply and explain rule-based lines of reasoning in a way that a subject matter expert could understand and find credible.
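As a rough, modern-day illustration of that goal (not the tooling we actually built back then), here is a minimal sketch that trains a small decision tree on made-up credit data and exports its branches as plain if/then rules that a subject matter expert could read, challenge and find credible or otherwise. The feature names, data and thresholds are all assumptions invented for the example.

```python
# Illustrative sketch only: pull reviewable, business-oriented rules out of a
# data-mining step. The credit-risk feature names and the data are invented.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

feature_names = ["income", "debt_ratio", "years_at_address", "late_payments"]
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=0, random_state=42)

# A shallow tree keeps the extracted rule set small enough for an expert to audit.
tree = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)

# export_text renders each branch of the tree as an if/then line of reasoning.
print(export_text(tree, feature_names=feature_names))
```

The shallow depth is the whole point of the exercise: a rule set a human can audit is worth more, in a duty-of-care setting, than a deeper model nobody can explain.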
Keep in mind that the ability of AI technology to produce explanations of its lines of reasoning was an absolute necessity back in the day. That requirement was reasonable but extraordinary, because not so long ago we didn’t even expect any sort of rational line-of-reasoning explanations from many ‘experts’.
So, what?
If we remove statistics from the scenario, the automatic recognition of patterns in databases and naïve learning, whilst having interesting pasts, a fuzzy and nuanced present and an uncertain future, are terrains plagued with shallow and imprecise understanding, exaggeration and make-believe, which makes any real and meaningful understanding quite problematic.
The fact of the matter is that there is plenty of visionary noise surrounding deep learning and AI, but very little in terms of concrete strategies to address truly significant challenges using this technology. There is also a lot of speculative fantasy about the promise of AI, but very little in terms of coherent and tangible answers to the absolutely fundamental question of “to what ends?” On top of that, no one has given an entirely satisfactory explanation of why we would place much trust in technology that can’t even explain itself. Is “the computer says no” really good enough?
Now I want to briefly look at what I call the lessons of
the past, the realities of the present and the promises of the future.
Some lessons of the past
Over almost six decades, AI has seen a few peaks of hype
followed by prolonged troughs of incredulity, disappointment and recrimination.
History has shown us that much of the promise of AI comes unstuck when
immature, unsound or unproven technology is taken to market too soon and
without a reasonable understanding of its applicability; when companies embrace
and spend significant time and effort trying to exploit things they don’t
comprehend in order to achieve ends they can’t define in tangible, reasonable
and realistic terms.
So, interesting experiments in AI are brought out of
academia far too soon, hyped to the heavens and eagerly acquired (if not
ultimately used) by commercial IT. The fact is that IT companies (and more
recently IT service companies) have unwittingly killed AI time and time again,
through their own irrational exuberance; bringing into question more than just
the technology.
Then we have the issue of hyperbole and AI. There is no end to the charlatans waiting in the wings to big-up the latest tech trend. But when things go pear-shaped, it is the business culture that will seek to destroy something that promised to deliver so much, yet failed to deliver anything other than liabilities, costs and wasted opportunities.
In the eighties one of my major projects in AI was to design an easy-to-use Expert System development and delivery capability and to integrate it tightly with my company’s flagship 4GL product (Mapper, now known as Unisys BIS). It was a rules-based system, and it worked well. But one of the major limitations of all Expert System shells at that time was the ability to manage and maintain large rule sets. It was the first time I realised that without adequate tools to manage complexity, business and technology risk exposure would become quite an issue. As a result, I tried to identify ways to simplify the tools without losing intrinsic value, and to provide the required tools to help reduce the perception of complexity. It was a partial success.
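For readers who never met an Expert System shell, the sketch below is a minimal, hypothetical forward-chaining rule engine. It is nothing like Mapper or BIS themselves, just an illustration of what a rules-based system does, and of why rule sets become hard to manage as they grow: every new rule is another fact-to-fact dependency that can interact with all the others.

```python
# Minimal forward-chaining rule engine, for illustration only (not Mapper/BIS).
# Each rule names the facts it needs and the fact it asserts; the engine keeps
# firing rules until no new facts emerge.
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    conditions: frozenset  # facts that must already be known
    conclusion: str        # fact asserted when the rule fires

def forward_chain(facts, rules):
    known = set(facts)
    fired = True
    while fired:
        fired = False
        for rule in rules:
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                print(f"fired {rule.name}: asserted {rule.conclusion}")
                fired = True
    return known

# Two hypothetical credit-vetting rules; real rule bases ran to thousands of
# these, which is exactly where the maintenance problem began.
rules = [
    Rule("r1", frozenset({"income_stable", "low_debt"}), "good_payer"),
    Rule("r2", frozenset({"good_payer"}), "approve_credit"),
]
forward_chain({"income_stable", "low_debt"}, rules)
```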
Later I had a project to address a classification challenge, and for this I decided to look at data mining again. Using small training datasets we initially had some interesting results, but then we noticed that when we increased the size of the datasets the learning became skewed, and not in a good way. So we pondered the problem and decided that we weren’t getting the right answers because we weren’t using enough data; our inability to feed the network appropriately was producing the wrong outcomes. That’s when we hit the brick wall of an idea: we threw tons of data at the neural network, so much so that it became incapable of discriminating or discerning anything at all. We had created a narrow and shallow idiot-savant and trained it so well that it eventually knew nothing.
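Purely as an illustration of that failure mode (the original work long predates today’s libraries, and every number here is invented), the sketch below shows how piling poorly curated, mislabelled records into a training set drives a small network’s test accuracy back towards coin-tossing: more data, but less signal.

```python
# Illustrative sketch: indiscriminately adding noisy, mislabelled training data
# until the network can no longer discriminate anything. All figures are invented.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=5_000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Flip a growing fraction of the training labels to mimic poorly curated data.
for frac in (0.0, 0.2, 0.35, 0.5):
    y_noisy = y_train.copy()
    n_flip = int(frac * len(y_noisy))
    if n_flip:
        idx = rng.choice(len(y_noisy), size=n_flip, replace=False)
        y_noisy[idx] = 1 - y_noisy[idx]
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)
    clf.fit(X_train, y_noisy)
    print(f"label noise {frac:.0%}  test accuracy {clf.score(X_test, y_test):.2f}")
```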
I mentioned some of these anecdotes to my colleagues at
IBM’s global data mining centre in Dublin. To my surprise, they had stumbled
upon exactly the same issues and had tried to resolve them in exactly the same
way, to no avail.
Another project I was involved in was at a major investment bank. The purpose was to build an artificial trader. To achieve this, it was decided to capture the expertise of expert traders as rules, and to combine the evaluation and execution of those rules with networks trained by the data mining of price curves and historic trading data, in order to build a functioning artificial trader. When the application was eventually trialled, three observations were made: the AI trader performed marginally better than the real expert trader; the real trader came a close second; and, when the real trader worked with the artificial trader, the outcomes were far worse than when the trader worked alone. There was also the belief that the trader performed less well because of the active benchmarking against a machine.
Some realities from the present
The more things change, the more they stay the same, as each new generation of techno-babbler tries to reinvent the wheel, whether it is needed or not, whilst astutely ignoring the sage advice of “if it ain’t broke, don’t fix it”.
What we see again is an ingrained inability to learn from the past, or even the present, and to judiciously apply those lessons in evaluating the present. This seems to be a constant in the evolution of humankind, made more obvious by the use and abuse of technology, hyperbole and ignorance.
What are the realities of the present? Here are a few:
People are either afraid of complexity and prefer not to see it, or they embrace it without really understanding the implications. The complexity of significant challenges is being ignored, and so are the complexity and risks of technological and process options.
People don’t know how to disambiguate or deconstruct complexity. In short, people don’t know how to consume the elephant, and people who do know aren’t much appreciated either.
No alignment of imagination and the practical. People don’t know how to successfully simplify the complex, and those who do know are treated like gods, fools or pariahs.
People are willingly and uncritically embracing
techno-fad dogma. The visibility and audibility of
tech-fad slogans have gone beyond the sloganizing used in tractor factories in
the communist era.
That best engineering principles are just so much
academic theory.
That more data means better data. More
data doesn’t mean better data. It’s a dopey generalisation without any
reasonable theoretical or practical underpinning.
That data is the new oil. Try telling that to your car engine. No, data is not like oil. It’s quite a dopey analogy.
That reasonable explanation is not a thing. If a machine is being used to produce recommendations for anything that requires a duty of care, then it must also be able to produce a reasonable, accurate and verifiable line of reasoning at the same time. Legal and compliance issues are also important considerations.
That a data scientist can do the job of a qualified statistician. “Without a grounding in statistics, a Data Scientist is a Data Lab Assistant.”
That it isn’t important to know what it is, just use it. Just because you can pick up a tool, it doesn’t mean you know how to use it. Just because it’s called a tool doesn’t necessarily mean that it is one. If all you want is one hammer, being given one thousand hammers at the same time doesn’t help much. If what you want to do is boil an egg, then a hammer isn’t really the thing.
In short, the present reality is a surfeit of actors in the big data, deep learning, data science and AI technology solutions spaces, who are running around aimlessly, throwing faeces and feasting upon the hype and hyperbole of it all, like acid-dropping chimps at a chimpanzee’s tea party, run amok.
Some promises for the future
What does the future offer? Probably more of the same,
but more so.
However, if I take an optimistic view, here are some
areas where I think we might make some advances:
The development of tools for managing and deploying
rule-based expertise into apps.
The development of tools for reducing the complexity of
managing and maintaining rule-bases that can be used, for example, in the
automatic generation of rule-centric APIs.
Using data governors with elements of AI to actually
reduce the volume of streaming data-as-noise.
The evolution of tools for treating data as an asset,
whether as a value-adding asset, an asset of no apparent value or a liability.
Using explainable AI to challenge and potentially negate the predictions of data-mining apps, thereby limiting the damage that such an app could inflict (see the sketch after this list).
Socialising Expert System development and delivery. Would
Microsoft like to embed an expert system shell in Excel, for example? I know
someone who has done that.
We must insist that AI be either explainable or fully controllable.
Just as we have for data, we must also have AI governance and a General AI Protection Regulation.
Finally, reducing the buzzword bingo bullshit and term abuse in IT, AI and data: that people take the time to learn what things actually mean and where things are really applicable or not, and that people understand why using terms they don’t understand is a very bad idea.
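To make the “explainable AI as a check” item above a little more concrete, here is a minimal, hypothetical sketch: an interpretable surrogate (a shallow decision tree) is fitted to mimic a black-box model, and any case where the two disagree is withheld from automatic action and flagged for human review. The models, data and the “any disagreement” threshold are assumptions for illustration, not a recommendation of any specific product or method.

```python
# Illustrative sketch: use an interpretable surrogate to challenge a black-box
# model and withhold automatic action where the two disagree. Everything invented.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2_000, n_features=10, random_state=1)

black_box = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
bb_pred = black_box.predict(X)

# Fit a shallow, human-readable surrogate to the black box's own predictions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, bb_pred)
sg_pred = surrogate.predict(X)

disagreements = bb_pred != sg_pred
print(f"{disagreements.sum()} of {len(X)} cases flagged for human review")
print(export_text(surrogate))  # the surrogate's rules double as the offered explanation
```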
Summary
Now, wouldn’t that be nice? Of course, there is a lot
more than that. But, blogs being blogs…
With such a subject it is easy to become conflicted.
At the back of my mind, I have the idea that deep
learning is just God’s way of telling companies that they have too much money
and not enough sense. I also believe that a lot more work needs to be
done in investigating feature extraction from data before we can even begin to
consider it as a maturing technology.
That said, the world of data and the use of that data can be an exciting place. Maybe one day, not too far away, someone will come up with a useful AI or data technology that doesn’t actually require hyping.
Many thanks for reading.
Have a beautiful June 2019.
Martyn Jones
One of the sales execs looked at a TI Explorer workstation I was using and asked, “Can it do accounts?” Then he followed it up with, “It’s no better for business than a bloody expensive anchor.”