Monday, 3 June 2019

Deep learning and shallow understanding

Martyn Richard Jones

Bruxelles, 21st May 2019

 

Hi, I'm Martyn Jones over at GoodStrat.Com (the Good Strategy Company), and today I want to talk about something I like to call deep learning and shallow understanding.

To begin at the beginning

In spite of, or perhaps because of, the many years I worked in artificial intelligence, I believe the current long-distance love affair with AI, and with what is euphemistically termed ‘deep learning’, to be somewhat irrational in its exuberance.
In the eighties I was experimenting with automatic feature extraction, pattern recognition and parallel distributed processing (see Rumelhart and McClelland), which led me to adaptive neural networks.
My research and development in this field showed me that automatically mining nuggets of gold, or automatically gaining real insight from data, is not easy, far from stable and frequently not achievable. Interesting experiments, but often impractical in a business setting.
        
Back in the day I was interested in examples of possibilities in this ‘technology space’ and in trying to design, train and productise neural networks. At Unisys we started out trying to do things such as predicting whether a person would be a member of the Sharks or the Jets (New York street gangs, if I remember rightly) and applying that to divining the credit risks of individuals and businesses; elsewhere we were creating simple applications for handwritten-number recognition and, a bit more sophisticated, basic document recognition.
To that end, I addressed the European IEEE conference on Neural Networks in Nice regarding our work in the USA and at the European Centre for AI and Advanced Information Technologies in Madrid, where I was working at the intersection of AI and advanced database technologies with peers from across the global industry. At the same time my company, Sperry Univac, was the go-to IT company for engineers, designers and computer scientists.
Sperry and Unisys gave me licence to research, design and build proof-of-concept prototypes in the areas of complex very-large-database management; a tightly integrated 4GL and expert-system development platform (extending a 4th-generation language product called Mapper); heterogeneous cross-platform database query management (a project for the EU); and AI and data mining, the deep learning of today.
But central to this blog piece was my work in the area of AI and data mining. The goal I set myself was to find ways of extracting business-oriented rules from the data-mining process, and to be able to apply and explain rule-based lines of reasoning in a way that a subject-matter expert could understand and find credible.
Keep in mind that the ability of AI technology to produce explanations of its lines of reasoning was an absolute necessity back in the day. That was reasonable but also extraordinary, because not so long before we hadn't expected any sort of rational line-of-reasoning explanation from many human ‘experts’.
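
To make that concrete, here is a minimal sketch of the idea in present-day Python, rather than anything we actually shipped: a tiny forward-chaining rule engine that records its line of reasoning as it fires. The credit-risk rules and facts are invented purely for illustration.

# A tiny forward-chaining rule engine that records its line of reasoning.
# The credit-risk rules and facts are invented for illustration only.
rules = [
    ("late_payments and high_debt_ratio", {"late_payments", "high_debt_ratio"}, "poor_credit_risk"),
    ("stable_income and low_debt_ratio", {"stable_income", "low_debt_ratio"}, "good_credit_risk"),
    ("poor_credit_risk", {"poor_credit_risk"}, "decline_application"),
]

def infer(facts):
    """Fire rules until nothing new can be concluded, logging each step."""
    facts, trace, changed = set(facts), [], True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append(f"because {name}, conclude {conclusion}")
                changed = True
    return facts, trace

conclusions, trace = infer({"late_payments", "high_debt_ratio"})
print("\n".join(trace))  # the line of reasoning a subject-matter expert can audit

The point is not the few lines of code; it is that the explanation trace comes for free with the representation, which is precisely what today's deep networks struggle to give us.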

So, what?

If we remove statistics from the scenario, the automatic recognition of patterns in databases and naïve machine learning, whilst they have interesting pasts, have a fuzzy and nuanced present and an uncertain future. These are terrains plagued with shallow and imprecise understanding, exaggeration and make-believe, which makes any real and meaningful understanding quite problematic.
The fact of the matter is that there is plenty of visionary noise surrounding deep learning and AI, but very little in terms of concrete strategies to address truly significant challenges using this technology. There is also a lot of speculative fantasy about the promise of AI, but very little in terms of coherent and tangible answers to the absolutely fundamental question of “to what ends?” On top of that, there hasn't been an entirely satisfactory explanation of why we should place much trust in technology that can't even explain itself. Is “the computer says no” really good enough?
Now I want to briefly look at what I call the lessons of the past, the realities of the present and the promises of the future.

Some lessons of the past

Over almost six decades, AI has seen a few peaks of hype followed by prolonged troughs of incredulity, disappointment and recrimination. History has shown us that much of the promise of AI comes unstuck when immature, unsound or unproven technology is taken to market too soon and without a reasonable understanding of its applicability; when companies embrace, and spend significant time and effort trying to exploit, things they don't comprehend in order to achieve ends they can't define in tangible, reasonable and realistic terms.
So, interesting experiments in AI are brought out of academia far too soon, hyped to the heavens and eagerly acquired (if not ultimately used) by commercial IT. The fact is that IT companies (and more recently IT service companies) have unwittingly killed AI time and time again, through their own irrational exuberance; bringing into question more than just the technology.
Then we have the issue of hyperbole and AI. There is no end to the charlatans waiting in the wings to big-up the latest tech trend. But when things go pear-shaped it's the business culture that will seek to destroy something that promised to deliver so much, yet failed to deliver anything other than liabilities, costs and wasted opportunities.
In the eighties one of my major projects in AI was to design easy-to-use expert-system development and delivery capabilities and to integrate them tightly with my company's flagship 4GL product (Mapper, now known as Unisys BIS). It was a rules-based system, and it worked well. But one of the major limitations of all expert-system shells at that time was the difficulty of managing and maintaining large rule sets. It was the first time I realised that, without adequate tools to manage complexity, business and technology risk exposure would become quite an issue. As a result, I tried to identify ways to simplify the tools without losing intrinsic value, and to provide the tools needed to reduce the perception of complexity. It was a partial success.
Later I had a project to address a classification challenge, and for this I decided to look at data mining again. Using small training datasets we initially had some interesting results, but we noticed that when we increased the size of the datasets the learning became skewed, and not in a good way. We pondered the problem and decided that we weren't getting the right answers because we weren't using enough data; our inability to feed the network appropriately was producing the wrong outcomes. That's when we hit a brick wall of an idea: we threw tons of data at the neural network, so much so that it became incapable of discriminating or discerning anything at all. We had created a narrow and shallow idiot savant and trained it so well that it eventually knew nothing.
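
That failure mode is easy to reproduce with today's tools. Here is a minimal sketch using scikit-learn and purely synthetic data (not our original work, which is long gone): a nearest-neighbour classifier trained on a small clean set does respectably, and piling on randomly labelled rows makes it progressively worse, not better.

# A minimal sketch of how piling on badly labelled data degrades learning.
# Purely synthetic data for illustration; requires numpy and scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

def score_with_noise(n_noisy):
    """Train on 500 clean rows plus n_noisy rows with random labels."""
    X_clean, y_clean = X_train[:500], y_train[:500]
    idx = rng.integers(0, len(X_train), size=n_noisy)   # resample real inputs...
    y_noise = rng.integers(0, 2, size=n_noisy)          # ...but give them random labels
    X_all = np.vstack([X_clean, X_train[idx]])
    y_all = np.concatenate([y_clean, y_noise])
    return KNeighborsClassifier().fit(X_all, y_all).score(X_test, y_test)

for n in (0, 500, 2000, 10000):
    print(f"{n:5d} noisy rows -> test accuracy {score_with_noise(n):.3f}")

Once the randomly labelled rows outnumber the clean ones many times over, accuracy drifts back towards the coin-toss the model was fed.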
I mentioned some of these anecdotes to my colleagues at IBM’s global data mining centre in Dublin. To my surprise, they had stumbled upon exactly the same issues and had tried to resolve them in exactly the same way, to no avail.
Another project I was involved in was at a major investment bank. The purpose was to build an artificial trader. To achieve this, it was decided to capture the expertise of expert traders as rules, and to combine the evaluation and execution of those rules with networks trained by data mining price curves and historic trading data. When the application was eventually trialled, three observations were made: the artificial trader performed marginally better than the real expert trader; the real trader came a close second; and when the trader worked with the artificial trader, the outcomes were far worse than when the trader worked alone. There was also the belief that the human trader performed less well because of the active benchmarking against a machine.

Some realities from the present

The more things change the more they stay the same as each new generation of techno-babbler tries to reinvent the wheel, whether it is needed or not, whilst astutely ignoring the sage advice of “if it ain’t broke, don’t fix it”.
What we see again is an ingrained inability to learn from the past, or even the present, and to judiciously apply those lessons in evaluating the present. This seems to be a constant in the evolution of humankind, made more obvious by the use and abuse of technology, hyperbole and ignorance.
What are the realities of the present? Here are a few:
People are either afraid of complexity and prefer not to see it, or they embrace it without really understanding the implications. The complexity of significant challenges is being ignored, and so are the complexity and risks of technological and process options.
People don’t know how to disambiguate or deconstruct complexity. In short, people don’t know how to consume the elephant, and people who do know aren’t that much appreciated either.
There is no alignment of imagination and the practical. People don't know how to successfully simplify the complex, and those who do know are treated like gods, fools or pariahs.
People are willingly and uncritically embracing techno-fad dogma. The visibility and audibility of tech-fad slogans have gone beyond the sloganizing used in tractor factories in the communist era.

Then there are the mistaken beliefs of the moment. Here are a few:
That best engineering principles are just so much academic theory.
That more data means better data. More data doesn’t mean better data. It’s a dopey generalisation without any reasonable theoretical or practical underpinning.
That data is the new oil. Try telling that to your car engine. No, data is not like oil. It's quite a dopey analogy.
That reasonable explanation is not a thing. If a machine is being used to produce recommendations for anything that requires a duty of care, then it must also be able to produce a reasonable, accurate and verifiable line of reasoning at the same time. Legal and compliance issues are also important considerations.
A data scientist can do the job of a qualified statistician. “Without a grounding in statistics, a Data Scientist is a Data Lab Assistant.”
It isn’t important to know what it is, just use it. Just because you can pick up a tool it doesn’t mean you know how to use it. Just because it’s called a tool doesn’t necessarily mean that it is. If all you want is one hammer, being given one thousand hammers at the same time doesn’t help as much. If what you want to do is boil an egg then a hammer isn’t really the thing.
In short, the present reality is the presence of a surfeit of actors in the big data, deep learning, data science and AI technology solutions spaces, who are running around aimlessly, throwing faeces and feasting upon the hype and hyperbole of it all, like acid-dropping chimps at a chimpanzee’s tea-party, run amok.

Some promises for the future

What does the future offer? Probably more of the same, but more so.
However, if I take an optimistic view, here are some areas where I think we might make some advances:
The development of tools for managing and deploying rule-based expertise into apps.
The development of tools for reducing the complexity of managing and maintaining rule-bases that can be used, for example, in the automatic generation of rule-centric APIs.
Using data governors with elements of AI to actually reduce the volume of streaming data-as-noise (a minimal sketch follows this list).
The evolution of tools for treating data as an asset, whether as a value-adding asset, an asset of no apparent value or a liability.
Using explainable AI to challenge, and potentially to negate, the predictions of data-mining apps, thereby limiting the damage that such an app could inflict.
Socialising Expert System development and delivery. Would Microsoft like to embed an expert system shell in Excel, for example? I know someone who has done that.
Insisting that AI be either explainable or fully controllable.
Just as we have for data, we must also have AI governance and a General AI Protection Regulation.
Finally, reducing the buzzword bingo bullshit and term abuse in IT, AI and data. That people take the time to learn what things actually mean and where things are really applicable or not. That people understand why using terms that they don’t understand is a very bad idea.
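
On the data-governor point above, a hypothetical sketch of the simplest possible form: a filter that forwards a streamed reading only when it adds information. The threshold and the sample stream are invented for illustration.

# A hypothetical "data governor": forward a reading only when it differs
# meaningfully from the last reading forwarded, discarding data-as-noise.
def govern(stream, threshold=0.5):
    last = None
    for reading in stream:
        if last is None or abs(reading - last) > threshold:
            last = reading
            yield reading

noisy = [10.0, 10.1, 9.9, 10.0, 12.0, 12.1, 12.0, 8.0]
print(list(govern(noisy)))  # [10.0, 12.0, 8.0]: the jitter never leaves the pipe

A real governor would need smarter notions of "adds information" than a fixed threshold, but even this much would thin out a lot of streaming noise at the source.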

Summary

Now, wouldn’t that be nice? Of course, there is a lot more than that.  But, blogs being blogs…
With such a subject it is easy to become conflicted.
At the back of my mind, I have the idea that deep learning is just God’s way of telling companies that they have too much money and not enough sense.  I also believe that a lot more work needs to be done in investigating feature extraction from data before we can even begin to consider it as a maturing technology.
That said, the world of data and the use of that data can be an exciting place. Maybe one day not too far away someone will come up with a useful AI or data technology that doesn't actually require hyping.
Many thanks for reading.
Have a beautiful June 2019.
Martyn Jones
PS: One of the sales execs once looked at a TI Explorer workstation I was using and asked… can it do accounts? Then followed it up with “it's no better for business than a bloody expensive anchor”.
