David Tolfree, VP MANCEF
In the early 1980s, when powerful so-called supercomputers were only just becoming a reality, communication and information technologies were still developing and were far less sophisticated than they are now. In those years, as research physicists, our ambitions to understand the structure of matter were restricted by a lack of data and of efficient means of analysing it. Desktop computers and mobile phones didn’t exist, so recording was done by writing in notebooks. We were still in the era of analog. I spent many long days and nights trying to make sense of the results obtained from the previous day’s shift on a particle accelerator.
Data rates were measured in kilobytes per second, so experiments took a long time to complete. Error bars on data points were often so large that statistical theory had to be invoked to make sense of the results. The paucity of data and the difficulties of interpretation were largely attributable to the limited efficiency of the detector and measurement systems.
Accurate weather forecasting, diagnosing medical conditions, predicting future markets for manufactured products, making reliable business decisions and understanding astronomical observations are just some of the areas that suffered from the lack of data. Now that we are in the digital and information age known as Industry 4.0, that is history. Instead of a lack of data, we are challenged by how to use an increasing deluge of it. We have entered the era of ‘Big Data’.
Big data is inextricably linked to Industry 4.0. Both terms have become part of the new lexicon of the digital era in discussions of automated manufacturing and production, AI and the IoT.
What is big data and how have we arrived at the current situation?
First, there is no rigorous definition of big data, but generally it refers to analysis that can be carried out at a large scale with a precision that is not attainable at a small scale. The ramifications of big data are far-reaching. It has overturned many established practices, challenged the way we live and interact, and enhanced decision-making and planning processes. But its most profound implication will be its influence in advancing artificial intelligence (AI) in all fields of human endeavour. It is this that presents the greatest challenge to society.
Historical Origins
The invention of the World Wide Web at CERN in 1989, built on top of the existing Internet, was a landmark breakthrough for high-energy particle physics. The arrival of powerful particle accelerators enabled large-scale, complex experiments to be carried out. They produced vast amounts of data that had to be distributed to physicists around the world for analysis by the so-called supercomputers that were then uncommon. Coupled in the 1990s with the development and increased availability of large computer processors, data collection and analysis took a huge leap forward. The development of microsystems technologies enabled the production of multi-sensor arrays that greatly extended the methods of collecting data. Embedded microsensors, now found in almost every device, are an essential part of the IoT. They have advanced smart monitoring across many fields, including wearable electronics, environmental monitoring, transport systems, production lines and medical diagnostics.
The fields of astronomy, weather forecasting, genomics and medicine soon gained the advantage of these developments. Powerful astronomical telescopes, orbiting communication and observation satellites, and high-resolution microscopes now produce unparalleled visions of the Universe, global communications, and new insights into living cells at the nanoscale. The integration of on-line computers with such devices has enabled the speedy analysis of massive amounts of data that can be distributed and shared around the world. It has revolutionised weather forecasting and saved lives and property from storm damage.
The Current Situation
Advanced detection, sensing and measurement systems are the generators of big data, but the challenge facing us now is how to handle and use the data they provide. This capability has now migrated across the spectrum of almost all human activity beyond scientific research. Since we are approaching the limits of what the human brain can accommodate, AI is the next step. Decision-making machines are already in operation in smart factories. Big data solutions allow decision-making to be accelerated and business processes to be optimised, helping manufacturers stay competitive in the global market, particularly for new products.
Google, with its massive on-line databases, has become the global encyclopedia. Its maps and GPS-based positioning, accessible from any device anywhere, are perhaps the best example of the general use of big data.
Perhaps the most significant business outcome of big data is in marketing. Companies that sell products want to know their markets and who might want to buy from them. Feedback from potential buyers is essential for making the right advertising and marketing decisions.
Through social media and on-line companies such as Google, Amazon and Facebook, such information can be captured. Every purchase or enquiry is logged in a database, enabling customer profiles to be recorded. The methodology is not new, but the ability to reach millions, or even billions, of people has changed the paradigm. It converts qualitative data into quantitative data, giving it greater credibility and reducing risk in decision-making.
Statistical Data
Let’s examine some published statistics, taken from a variety of sources, on the scale of big data before looking at future challenges. Big data sets provide information. They can refer to anything from a few terabytes (10^12 bytes) to many petabytes (10^15 bytes) of data.
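For readers less familiar with these units, the decimal (SI) byte prefixes form a simple ladder of powers of a thousand; the short sketch below tabulates them and checks, for example, that a petabyte is a thousand terabytes:

```python
# Decimal (SI) byte prefixes, each step a factor of 1,000
UNITS = {
    "kilobyte":  10**3,
    "megabyte":  10**6,
    "gigabyte":  10**9,
    "terabyte":  10**12,
    "petabyte":  10**15,
    "exabyte":   10**18,
    "zettabyte": 10**21,
}

# One petabyte is a thousand terabytes
print(UNITS["petabyte"] // UNITS["terabyte"])  # 1000
```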
Over the past year, the world’s computers, mobile phones and other devices generated an estimated 12 zettabytes (~10^21 bytes) of information. Doubling in size every year, this is forecast to grow to around 44 zettabytes by 2020, with a corresponding increase in the amount of data transmitted over communications networks from 1 to 2.3 zettabytes. Total mobile traffic, including smartphones, will reach 30 exabytes (~10^18 bytes). Much of this will come from home appliances, cars and wearable electronics as they become part of the IoT. These predictions are already materialising in the more technically advanced parts of the world.
Most newly created data between now and 2020 will be produced by machines as they communicate with each other over data networks. It is estimated that by then there could be over 80 billion data-generating stations. This poses immense safety and security problems for everybody. Corrupted data will wreak havoc on a totally connected world, so new forms of cyber-security are urgently needed.
Most data transmission is through wireless networks at frequencies up to 300 GHz. This part of the electromagnetic spectrum will soon become saturated, so new frequencies will have to be exploited. One candidate is the visible-light part of the spectrum, which extends from 430 to 770 THz, offering more than 1,000 times the bandwidth of the RF portion. Fibre-optic networks are already being used for faster data transmission. Work is in progress using high-brightness lasers to transmit through air, so in the future light stations could be established on masts alongside existing transmitters.
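The ‘more than 1,000 times’ figure follows directly from the frequencies quoted above; a quick back-of-the-envelope check, using only the numbers in the text:

```python
# Bandwidth comparison using the figures quoted in the text:
# radio spectrum up to 300 GHz vs visible light from 430 to 770 THz.
rf_bandwidth_hz = 300e9                                   # ~0 to 300 GHz
visible_bandwidth_hz = 770e12 - 430e12                    # 340 THz window

ratio = visible_bandwidth_hz / rf_bandwidth_hz
print(round(ratio))  # 1133 -- over 1,000 times the RF bandwidth
```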
Challenges, Danger and Future Directions
In previous articles I highlighted robotic manufacturing, smart cities, driverless cars and AI. Big data is the blood that will feed these new ventures and keep them alive. Without its flow, all will collapse and society will regress. But the increasing dependence on big data brings hidden dangers for society.
Data brings increased knowledge to everyone. The challenge is how to use it beneficially. The philosopher Francis Bacon once said that ‘knowledge is power’. But power can corrupt those who seek to use it for personal gain.
I am reminded of the sci-fi film ‘Minority Report’, in which Tom Cruise plays a policeman who uses big data to predict where the next crime will be committed and who will commit it, thereby identifying potential criminals.
In the film, innocent people were targeted for the wrong reasons. Society must be prevented from using big data for such purposes, since that would take it in the wrong direction.
But there is no turning back. Big data is bringing the future closer.
David Tolfree is currently the MANCEF Vice President. He is a professional physicist with 40 years’ research and managerial experience working for the UK’s Atomic Energy Authority and Research Councils. He was the co-founder and director of Technopreneur Ltd, a technical consultancy company for the commercial exploitation of micro/nanotechnologies, and a consultant to UK government departments on micro/nanotechnologies. He is one of the founding members of MANCEF and the UK Institute of Nanotechnology and is now a member of the UK KTN. David has 169 publications, including roadmaps, newspaper, magazine and journal articles, conference proceedings and books. He has given interviews on television and radio on micro/nanotechnologies, and is an editor and reviewer for a number of related scientific journals. He currently serves on the Editorial Advisory Board of the International Commercial Micro Manufacturing Magazine.