
Caltrain Quantified: An Exploration in IoT

Guest blog post by Cameron Turner

Executive Summary

Though often the focus of the urban noise debate, Caltrain is one of many contributors to overall sound levels along the Bay Area’s peninsula corridor. In this investigation, Cameron Turner of Palo Alto’s The Data Guild takes a look at this topic using a custom-built Internet of Things (IoT) sensor atop the Helium networking platform.

Introduction

If you live in (or visit) the Bay Area, chances are you have experience with the Caltrain. Caltrain is a commuter line which travels 77.4 miles between San Francisco and San Jose, carrying over 50,000 passengers on over 70 trains daily.[1]

I’m lucky to live two blocks from the Caltrain line, and enjoy the convenience of the train. My office, The Data Guild, is just one block away. The Caltrain and its rhythms, bells and horns are a part of our daily life, connecting us to the City and, through connections to BART, Amtrak, SFO and SJC, to the rest of the world.

Over the holidays, my 4-year-old daughter and I undertook a project to quantify the Caltrain through a custom-built sensor and reporting framework, to get some first-hand experience in the so-called Internet of Things (IoT). This project also aligns with The Data Guild’s broader ambition to build out custom sensor systems atop network technologies to address global issues. (More on this here.)

Let me note here that this project was an exploration, and was not conducted in a manner (in goals or methodology) intended to provide fodder for either side of the many ongoing Caltrain debates: the electrification project, the quiet zone, or the tragic recent deaths on the tracks.

Background

My interest in such a project began with an article published in the Palo Alto Daily in October 2014. The article addressed the call for a quiet zone in downtown Palo Alto, following complaints from residents of buildings closest to the tracks. Many residents voiced subjective frustrations based on personal experience.

According to the Federal Railroad Administration (FRA), whose rules govern Caltrain operations, train engineers “must begin to sound train horns at least 15 seconds, and no more than 20 seconds, in advance of all public grade crossings.”

Additionally: “Train horns must be sounded in a standardized pattern of 2 long, 1 short and 1 long blasts,” and “The maximum volume level for the train horn is 110 decibels which is a new requirement. The minimum sound level remains 96 decibels.”

Questions

Given the numeric nature of the rules, and the subjective nature of the current analysis/discussion, this seemed an ideal problem to address with data. Some of the questions we hoped to address, on this issue and beyond:

  • Timing: Are train horns sounded at the appropriate time?
  • Schedule: Are Caltrains coming and going on time?
  • Volume: Are the Caltrain horns sounding at the appropriate level?
  • Relativity: How do Caltrain horns contribute to overall urban noise levels?

Methodology

Our methodology to address these topics included several steps:

  1. Build a custom sensor equipped to capture ambient noise levels
  2. Leverage an uplink capability to receive data from the sensor in near real-time
  3. Deploy the sensor, then monitor its output and test/modify as needed
  4. Develop a crude statistical model to convert sensor levels (voltage) to sound levels (dB)
  5. Analyze and report the results

Apparatus

We developed a simple sensor based on the Arduino platform. A baseline Uno board, equipped with an onboard ATmega328 processor, was wired to an Adafruit Electret Microphone Amplifier (MAX4466) with adjustable gain.

We were lucky to be introduced through the O’Reilly Strata NY event to a local company: Helium. Backed by Khosla Ventures et al, Helium is building an internet of things platform for smart machines. They combine a wireless protocol optimized for device and sensor data with cloud-based tooling for working with the data and building applications.

We received a Beta Kit which included an Arduino shield for uplink to their bridge device, which then connects via GSM to the Internet. Here is our sensor (left) with the Helium bridge device (right).

Deployment

With our instrument ready, we sought a safe location to deploy it. By good fortune, a family friend (and member of the staff of the Stanford Statistics department, where I am completing my degree) owns a home immediately adjacent to a Caltrain crossing, where Caltrain operators are required to sound their horn.

Conductors might also be particularly sensitive to this crossing, Churchill St., due to its proximity to Palo Alto High School and the recent tragic train-related death of a teen.

From a data standpoint, this location was ideal, as it sits approximately halfway between the Palo Alto and California Avenue stations.

We deployed our sensor outdoors facing the track in a waterproof enclosure and watched the first data arrive.

Monitoring

Through a connector to Helium’s fusion platform, we were able to see data in near real-time (note the “debug” window on the right, where the microphone output level arrives each second).

We used another great service, provided by Librato (now a part of SolarWinds), a San Francisco-based monitoring and metrics company. Using Librato, we enabled data visualization of the sound levels as they were generated, and were able to view each reading relative to its history. This was a powerful capability as we worked to fine-tune the power and amplifier.

Note the spike in the middle of the image above, which we could map to a train horn heard ourselves during the training period.

Data Preparation

Next, we took a weekday (January 7, 2015), which appeared typical of a non-holiday weekday relative to the entire month of data collected. For this period, we were able to construct a 24-hour data set at 1-second sample intervals for our analysis.

Data was accessed through the Librato API, downloaded as JSON, converted to CSV and cleansed.
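As an illustration of this step, here is a minimal Python sketch of the conversion, assuming a Librato-style JSON export; the filename and field names ("measurements", "measure_time", "value") are assumptions for illustration, not the project's actual code.

    # Minimal sketch: flatten a Librato-style JSON export into a CSV of
    # (timestamp, voltage) rows. Field names are assumed, not confirmed.
    import csv
    import json

    with open("mic_levels.json") as f:        # hypothetical export file
        payload = json.load(f)

    rows = [(m["measure_time"], m["value"])   # epoch seconds, raw voltage
            for m in payload.get("measurements", [])]

    with open("mic_levels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "voltage"])
        writer.writerows(sorted(rows))        # keep samples chronological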

Analysis

First, to gain intuition, we took a sample recording gathered at the sensor site of a typical train horn.

Click HERE to hear the sample sound.

Using matplotlib within an IPython notebook, we are able to “see” this sound, both in its raw audio form and as a spectrogram showing frequency:
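The figures themselves are not reproduced here, but a similar view can be generated in a few lines; the sketch below assumes the horn sample was saved as a WAV file (the filename is hypothetical).

    # Minimal sketch: plot a horn recording as raw audio and spectrogram.
    import matplotlib.pyplot as plt
    from scipy.io import wavfile

    rate, samples = wavfile.read("horn_sample.wav")  # hypothetical file
    if samples.ndim > 1:
        samples = samples[:, 0]              # keep one channel if stereo

    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6))
    ax1.plot(samples)                        # raw audio form
    ax1.set_ylabel("amplitude")
    ax2.specgram(samples, Fs=rate)           # frequency content over time
    ax2.set_xlabel("seconds")
    ax2.set_ylabel("Hz")
    plt.show()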

Next, we look at our entire 24 hours of data, beginning on the evening of January 6 and concluding 24 hours later on the evening of January 7. Note the quiet “overnight” period, about a quarter of the way across the x-axis.

To put this into context, we overlay the Caltrain schedule. Given the sensor sits between the Palo Alto and California Avenue stations, and given the variance in stop times, we mark northbound trains using the scheduled stop at Palo Alto (red), and southbound trains using the scheduled stop at California Ave (green).

Initially, we can make two contrasting observations: many peak sound events lie quite close to these stop times, as expected. However, many of the sound events (including the maximum recorded value, the nightly ~11pm freight train) occur independent of the scheduled Caltrains.

Conversion to Decibels

On the y-axis above, the sound level is reported as the raw voltage output from the microphone. To address the questions above, we needed a way to convert these values to decibel units (dB).

To do so, a low-cost sound meter was obtained from Fry’s. Then an on-site calibration was performed to map decibel readings from the sensor to the voltage output uploaded from our microphone.

Within RStudio, these values were plotted and a crude estimation function was derived to create a linear mapping between voltage and dB:

The goal of using a straight-line estimate rather than a log-linear one was to compensate for differences in apparatus (dB meter vs. microphone within its casing) and to keep the approximations conservative. Most of the events in question during the observation period fell between 2.0 and 2.5 volts, where we collected several training points (above).
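For illustration, the calibration step can be reduced to a few lines of Python; the voltage/dB pairs below are placeholders standing in for the actual on-site training points.

    # Minimal sketch: fit the crude linear voltage-to-dB estimator.
    import numpy as np

    volts = np.array([1.8, 2.0, 2.2, 2.4, 2.5])  # placeholder calibration points
    dbs   = np.array([55., 62., 70., 78., 82.])  # matching meter readings

    slope, intercept = np.polyfit(volts, dbs, 1)  # degree-1 (straight-line) fit

    def to_decibels(voltage):
        """Map raw microphone voltage to an estimated dB value."""
        return slope * voltage + intercept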

A challenge in this process was the slight lag, of unknown variance, between meter readings and data collection. As such, only “peak” and “trough” measurements could reliably be used to build the model.

With this crude conversion estimator in hand, we could now replot the data above with decibels on the y-axis.

Clearly the “peaks” above are of interest as outliers from the baseline noise level at this site. In fact, there are 69 peaks (>82 dB) observed (at a 1-second sample rate), versus 71 scheduled trains for the same period. Though this location was about 100 yards from the tracks, the horns registered quieter than the 96–110 dB range specified by the FRA (with the caveat above re: the crude approximator).
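Counting “peaks” in a 1-second series takes a little care: a horn blast can span several consecutive samples, which should be merged into one event. A minimal sketch of that logic, assuming the calibrated dB series is in hand:

    # Minimal sketch: count distinct loud events in the 1 Hz dB series,
    # merging consecutive above-threshold seconds into a single peak.
    import numpy as np

    THRESHOLD_DB = 82.0   # the cutoff used in the analysis above

    def count_peaks(db_series):
        above = np.asarray(db_series) > THRESHOLD_DB
        # a peak starts wherever the series crosses the threshold upward
        starts = above & ~np.concatenate(([False], above[:-1]))
        return int(starts.sum())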

It is also interesting that we are not observing the standardized “two long, one short, one long” pattern. Though some events are lost to the sampling rate, qualitatively this does not seem to be a practice consistently followed by the engineers. Those who live in Palo Alto know this to be true as well.

Also worth noting is the high variance of ambient noise, the central horizontal blue “cloud” above, ranging from ~45 dB to ~75 dB. We sought to understand the nature of this variance and whether it contained structure.

Looking more closely at just a few minutes of data during the Jan 7 morning commute, we can see that indeed there is a periodic structure to the variance.

Comparing with on-site observations, we determined that this period was defined by the traffic signal which sits between the sensor and the train tracks on Alma St. Additionally, we often observe an “M” structure (bimodal peak), indicating southbound traffic accelerating from the stop line when the light turns green, followed by passing northbound traffic seconds later.
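One simple way to confirm this kind of periodic structure is autocorrelation: a traffic-light cycle shows up as a strong repeat interval in the ambient-noise series. A sketch of the idea (not the project's actual code):

    # Minimal sketch: find the dominant repeat interval in the 1 Hz
    # dB series via autocorrelation, e.g. a traffic-signal cycle.
    import numpy as np

    def dominant_period_seconds(db_series, min_lag=10, max_lag=300):
        x = np.asarray(db_series, dtype=float)
        x = x - x.mean()
        acf = np.correlate(x, x, mode="full")[len(x) - 1:]
        acf /= acf[0]                        # normalize so lag 0 == 1
        return min_lag + int(np.argmax(acf[min_lag:max_lag]))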

Looking at a few minutes of the same morning commute, we can clearly see when the train passed and sounded its horn. Here again, green indicates a southbound train, red indicates a northbound train.

In this case, the southbound train passed slightly before its scheduled arrival time at the California Avenue station, and the Northbound train passed within its scheduled arrival minute, both on time. Note also the peak unassociated with the train. We’ll discuss this next.

Perhaps a more useful summary of the data collected is shown as a histogram, where decibels are shown on the x-axis and frequency (count) is shown on the y-axis.
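The histogram itself is a one-liner once the calibrated series exists; a sketch, reusing the hypothetical CSV from the earlier steps:

    # Minimal sketch: histogram of the calibrated 1-second dB samples.
    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.read_csv("mic_levels_db.csv")    # hypothetical calibrated data
    plt.hist(df["db"], bins=100)
    plt.xlabel("dB")
    plt.ylabel("count (1-second samples)")
    plt.show()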

We can clearly see a bimodal distribution: sound is roughly normally distributed, with a second distribution at the higher end. The question still remained: why did several of the peak observed values fall nowhere near any scheduled train time?

The answer here requires no sensors: airplanes, sirens and freight trains are frequent noise sources in Palo Alto. These factors, coupled with a nearby residential construction project, accounted for the non-regular noise events we observed.

Click HERE to hear a sample sound.

Finally, we subsetted the data into three groups: non-train minutes, northbound train minutes and southbound train minutes. The mean dB levels were 52.13, 52.18 and 52.32 respectively. While the ordering makes sense, these samples bury the outcome, since a horn blast may account for only one second of a train-minute. The difference between northbound and southbound is consistent with on-site observation: given that the sensor lies on the northeast corner of the crossing, horn blasts from southbound trains were more pronounced.
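The grouping itself is straightforward in pandas; here is a sketch under assumed column names and placeholder schedule times, not the study's actual data.

    # Minimal sketch: compare mean dB for non-train, northbound-train and
    # southbound-train minutes. Filenames, column names and the schedule
    # entries below are placeholders.
    import pandas as pd

    df = pd.read_csv("mic_levels_db.csv", parse_dates=["timestamp"])
    df["minute"] = df["timestamp"].dt.floor("min")

    northbound = pd.to_datetime(["2015-01-07 08:01", "2015-01-07 08:31"]).floor("min")
    southbound = pd.to_datetime(["2015-01-07 08:11", "2015-01-07 08:41"]).floor("min")

    df["group"] = "no_train"
    df.loc[df["minute"].isin(northbound), "group"] = "northbound"
    df.loc[df["minute"].isin(southbound), "group"] = "southbound"

    print(df.groupby("group")["db"].mean())  # the three means compared above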

Conclusion

Before drawing any conclusions, it should be noted again that these are not scientific findings, but rather an attempt to add some rigor to the discussion around Caltrain and noise pollution. Further study, with a longer period of analysis and replication of data collection, would be required to state these conclusions with statistical confidence.

That said, we can readdress the topics in question:

Timing: Are train horns sounded at the appropriate time?

The FRA requires engineers to sound their horn between 15 and 20 seconds before a crossing. Given the tight urban nature of this crossing, that rule seems a misfit. Caltrain engineers are sounding their horns within 2-3 seconds of the crossing, which seems more appropriate.

Schedule: Are Caltrains coming and going on time?

Though not explored in depth here, generally we can observe that trains are passing our sensor prior to their scheduled arrival at the upcoming station.

Volume: Are the Caltrain horns sounding at the appropriate level?

As discussed above, the apparent dB level at a location very close to the track was well below the FRA recommended levels.

Relativity: How do Caltrain horns contribute to overall urban noise levels?

The Caltrain horns add roughly 10 dB to peak baseline noise levels, including the periodic traffic events at the observed intersection.

Opinions

Due to their regular frequency and physical presence, trains are an easy target when it comes to urban sound attenuation efforts. However, the regular oscillations of traffic, sirens, airplanes and construction create a very high, if predictable, baseline above which trains must be heard.

Considering the importance of safety to this system, which operates just inches from bikers, drivers and pedestrians, there is a tradeoff to be made between supporting quiet-zone initiatives and preserving the ability of speeding trains to be heard.

In Palo Alto, as we move into an era of electric cars, improved bike systems and increased pedestrian access, the oscillations of noise created by non-train activities may indeed subside over time. And this, in turn, might provide an opportunity to lower the “alert sounds”, such as sirens and train horns, required to deliver these services safely. Someday much of our everyday activity might be accomplished quietly.

Until then, we can only appreciate these sounds which must rise above our noisy baseline, as a reminder of our connectedness to the greater bay area through our shared focus on safety and convenient public transportation.

Acknowledgements:

Sincere thanks to Helen T. and Nick Parlante of Stanford University, Mark Phillips of Helium and Nik Wekwerth/Jason Derrett/Peter Haggerty of Librato for their help and technical support.

Thanks also to my peers at The Data Guild, Aman, Chris, Dave and Sandy and the Palo Alto Police IT department for their feedback.

And thanks to my daughter Tallulah for her help soldering and moral support.

[1] http://en.wikipedia.org/wiki/Caltrain

Originally posted on LinkedIn. 



Security challenges for IoT

 

Guest blog post by vozag

The emergence of IoT presents security challenges greater than any industrial systems have seen.

The Open Web Application Security Project (OWASP) is a reputable international organization which focuses on improving the security of software. It sponsors the hugely popular Top Ten project, which publishes the top ten security risks for web applications all over the world.

 

The “OWASP Internet of Things (IoT) Top 10” project defines the top ten security surface areas presented by IoT systems. The project aims to provide practical security recommendations for builders, breakers, and users of IoT systems.

 

Last year HP, which started this project, used it as a baseline to evaluate ten widely used IoT devices and released a report. The study concluded that, on average, each device studied had 25 vulnerabilities across the categories listed in the project.

 

The top ten vulnerability categories, with the impact of each, are given below in the order listed in the project:

 

Insecure Web Interface

Insecure web interfaces can result in data loss or corruption, lack of accountability, or denial of access and can lead to complete device takeover.

 

Insufficient Authentication/Authorization

Insufficient authentication/authorization can result in data loss or corruption, lack of accountability, or denial of access and can lead to complete compromise of the device and/or user accounts.

 

Insecure Network Services

Insecure network services can result in data loss or corruption, denial of service or facilitation of attacks on other devices.

 

Lack of Transport Encryption

Lack of transport encryption can result in data loss and depending on the data exposed, could lead to complete compromise of the device or user accounts.

 

Privacy concerns

Collection of personal data along with a lack of protection of that data can lead to compromise of a user's personal data.

 

Insecure Cloud Interface

An insecure cloud interface could lead to compromise of user data and control over the device.

 

Insecure Mobile Interface

An insecure mobile interface could lead to compromise of user data and control over the device.

 

Insufficient Security Configurability

Insufficient security configurability could lead to compromise of the device whether intentional or accidental and/or data loss.

 

Insecure Software/Firmware

Insecure software/firmware could lead to compromise of user data, control over the device and attacks against other devices.

 

Poor Physical Security

Insufficient physical security could lead to compromise of the device itself and any data stored on that device.

 


Guest blog post by ajit jaokar

Often, Data Science for IoT differs from conventional data science due to the presence of hardware.

Hardware could be involved in integration with the Cloud or Processing at the Edge (which Cisco and others have called Fog Computing).

Alternatively, we see entirely new classes of hardware specifically involved in Data Science for IoT (such as IBM’s SyNAPSE chip for Deep Learning).

Hardware will increasingly play an important role in Data Science for IoT.

A good example is from a company called Cognimem, which natively implements classifiers (unfortunately, the company does not seem to be active any more, as per their Twitter feed).

In IoT, speed and real time response play a key role. Often it makes sense to process the data closer to the sensor.

This allows a limited/summarized data set to be sent to the server if needed, and also allows for localized decision making. This architecture leads to a flow of information out from the Cloud and the storage of information at nodes which may not reside in the physical premises of the Cloud.

In this post, I try to explore the various hardware touchpoints for Data analytics and IoT to work together.

Cloud integration: Making decisions at the Edge

The Intel Wind River edge management system is certified to work with the Intel stack and includes capabilities such as data capture, rules-based data analysis and response, configuration, file transfer, and remote device management.

Integration of Google Analytics into Lantronix hardware allows sensors to send real-time data to any node on the Internet or to a cloud-based application.

Microchip’s integration with Amazon Web Services uses an embedded application with the Amazon Elastic Compute Cloud (EC2) service, based on the Wi-Fi Client Module Development Kit. Languages like Python or Ruby can be used for development.

The integration of Freescale and Oracle consolidates data collected from multiple appliances from multiple Internet of Things service providers.

Libraries

Libraries are another avenue for analytics engines to be integrated into products, often at the point of creation of the device. Xively cloud services are an example of this strategy, through the Xively libraries.

APIs

In contrast, keen.io provides APIs for IoT devices to create their own analytics engines (for example, the Pebble smartwatch’s use of keen.io) without locking equipment providers into a particular data architecture.

Specialized hardware

We see increasing deployment of specialized hardware for analytics, for example Egburt from Camgian, which uses sensor fusion technologies for IoT.

In the Deep Learning space, GPUs are widely used, and more specialized hardware is emerging, such as IBM’s SyNAPSE chip. Even more interesting hardware platforms are appearing, such as Nervana Systems, which creates hardware specifically for neural networks.

Ubuntu Core and IFTTT spark

Two more initiatives on my radar deserve a space of their own, even though neither currently has an analytics engine: Ubuntu Core (Docker containers plus a lightweight Linux distribution as an IoT OS) and the IFTTT Spark initiative.

Comments welcome

This post is leading to a vision for a Data Science for IoT course/certification. Please sign up on the link if you wish to know more when it launches in February.

Image source: cognimem



Guest blog post by ajit jaokar

By Ajit Jaokar (@ajitjaokar). Please connect with me on LinkedIn if you want to stay in touch and receive future updates.

Cross-posted from my blog. I look forward to discussion/feedback here.

Note: The paper below is best read as a pdf which you can download from the blog for free

Background and Abstract

This article is part of an evolving theme. Here, I explain the basics of Deep Learning and how Deep Learning algorithms could apply to IoT and Smart city domains. Specifically, as I discuss below, I am interested in complementing Deep Learning algorithms using IoT datasets. I elaborate these ideas in the Data Science for Internet of Things program, which enables you to work towards being a Data Scientist for the Internet of Things (modelled on the course I teach at Oxford University and UPM – Madrid). I will also present these ideas at the International Conference on City Sciences at Tongji University in Shanghai and the Data Science for IoT workshop at the IoT World event in San Francisco.

Please connect with me if you want to stay in touch on linkedin and for future updates

Deep Learning

Deep learning is often thought of as a set of algorithms that ‘mimics the brain’. A more accurate description would be an algorithm that ‘learns in layers’. Deep learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts.

The obscure world of deep learning algorithms came into the public limelight when Google researchers fed 10 million random, unlabeled images from YouTube into their experimental Deep Learning system. They then instructed the system to recognize the basic elements of a picture and how these elements fit together. The system, comprising 16,000 CPU cores, was able to identify images that shared similar characteristics (such as images of cats). This canonical experiment showed the potential of Deep Learning algorithms. Deep Learning algorithms apply to many areas, including computer vision, image recognition, pattern recognition, speech recognition, behaviour recognition, etc.

 

How does a Computer Learn?

To understand the significance of Deep Learning algorithms, it’s important to understand how computers think and learn. Since the early days, researchers have attempted to create computers that think. Until recently, this effort has been rules-based, adopting a ‘top-down’ approach. The top-down approach involved writing enough rules for all possible circumstances. But this approach is obviously limited by the number of rules and by its finite rules base.

To overcome these limitations, a bottom-up approach was proposed. The idea here is to learn from experience. The experience was provided by ‘labelled data’. Labelled data is fed to a system and the system is trained based on the responses. This approach works for applications like Spam filtering. However, most data (pictures, video feeds, sounds, etc.) is not labelled and if it is, it’s not labelled well.

The other issue is in handling problem domains which are not finite. For example, the problem domain in chess is complex but finite, because there are a finite number of primitives (32 chess pieces) and a finite set of allowable actions (on 64 squares). But in real life, at any instant, we have a potentially infinite number of alternatives. The problem domain is thus very large.

A problem like playing chess can be ‘described’ to a computer by a set of formal rules. In contrast, many real-world problems are easily understood by people (intuitive) but not easy to describe (represent) to a computer (unlike chess). Examples of such intuitive problems include recognizing words or faces in an image. Such problems are hard to describe to a computer because the problem domain is not finite. Thus, the problem description suffers from the curse of dimensionality: as the number of dimensions increases, the volume of the space increases so fast that the available data becomes sparse. Computers cannot be trained on sparse data. Such scenarios are not easy to describe because there is not enough data to adequately represent the combinations of the dimensions. Nevertheless, such ‘infinite choice’ problems are common in daily life.

How do Deep learning algorithms learn?

Deep Learning is concerned with ‘hard/intuitive’ problems which have little or no rules and high dimensionality. Here, the system must learn to cope with unforeseen circumstances without knowing the rules in advance. Many existing systems, like Siri’s speech recognition and Facebook’s face recognition, work on these principles. Deep Learning systems are possible to implement now because of three reasons: high CPU power, better algorithms and the availability of more data. Over the next few years, these factors will lead to more applications of Deep Learning systems.

Deep Learning algorithms are modelled on the workings of the brain. The brain may be thought of as a massively parallel analog computer which contains about 10^11 simple processors (neurons), each of which requires a few milliseconds to respond to input. To model the workings of the brain, in theory, each neuron could be designed as a small electronic device which has a transfer function similar to a biological neuron. We could then connect each neuron to many other neurons to imitate the workings of the brain. In practice, it turns out that this model is not easy to implement and is difficult to train.

So, we make some simplifications in the model mimicking the brain. The resultant neural network is called “feed-forward back-propagation network”.  The simplifications/constraints are: We change the connectivity between the neurons so that they are in distinct layers. Each neuron in one layer is connected to every neuron in the next layer. Signals flow in only one direction. And finally, we simplify the neuron design to ‘fire’ based on simple, weight driven inputs from other neurons. Such a simplified network (feed-forward neural network model) is more practical to build and use.

Thus:

a)      Each neuron receives a signal from the neurons in the previous layer

b)      Each of those signals is multiplied by a weight value.

c)      The weighted inputs are summed, and passed through a limiting function which scales the output to a fixed range of values.

d)      The output of the limiter is then broadcast to all of the neurons in the next layer.

Image and parts of the description in this section adapted from: Seattle Robotics site

The most common learning algorithm for artificial neural networks is called Back Propagation (BP), which stands for “backward propagation of errors”. To use the neural network, we apply the input values to the first layer, allow the signals to propagate through the network and read the output. A BP network learns by example, i.e. we must provide a learning set that consists of some input examples and the known correct output for each case. So, we use these input-output examples to show the network what type of behaviour is expected. The BP algorithm allows the network to adapt by adjusting the weights, propagating the error value backwards through the network. Each link between neurons has a unique weighting value. The ‘intelligence’ of the network lies in the values of the weights. With each iteration of the errors flowing backwards, the weights are adjusted. The whole process is repeated for each of the example cases. Thus, to detect an object, programmers would train a neural network by rapidly sending across many digitized versions of data (for example, images) containing those objects. If the network did not accurately recognize a particular pattern, the weights would be adjusted. The eventual goal of this training is to get the network to consistently recognize the patterns that we recognize (e.g. cats).
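To make the mechanics concrete, here is a tiny numpy sketch of the loop just described: a feed-forward pass through one hidden layer, followed by backward propagation of errors that adjusts the weights. It is a textbook toy (learning XOR from labelled examples), not code from the article or any production system.

    # Minimal sketch: feed-forward + back-propagation on a toy problem.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):                      # the "limiting function"
        return 1.0 / (1.0 + np.exp(-z))

    # labelled examples: inputs with their known correct outputs (XOR)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1, W2 = rng.normal(size=(2, 4)), rng.normal(size=(4, 1))
    b1, b2 = np.zeros((1, 4)), np.zeros((1, 1))

    for _ in range(5000):
        h = sigmoid(X @ W1 + b1)         # weighted inputs, summed, limited
        out = sigmoid(h @ W2 + b2)       # broadcast to the next layer
        err = y - out                    # compare to the known output
        d_out = err * out * (1 - out)    # error flows backwards...
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 += 0.5 * h.T @ d_out          # ...adjusting each weight
        b2 += 0.5 * d_out.sum(axis=0, keepdims=True)
        W1 += 0.5 * X.T @ d_h
        b1 += 0.5 * d_h.sum(axis=0, keepdims=True)

After enough iterations, `out` settles near the known correct outputs, which is exactly the “given an input, this is the correct output” training loop described below.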

How does Deep Learning help to solve the intuitive problem

The whole objective of Deep Learning is to solve ‘intuitive’ problems, i.e. problems characterized by high dimensionality and no rules. The mechanism above demonstrates a supervised learning algorithm based on a limited modelling of neurons, but we need to understand more.

Deep learning allows computers to solve intuitive problems because:

  • With Deep learning, Computers can learn from experience but also can understand the world in terms of a hierarchy of concepts – where each concept is defined in terms of simpler concepts.
  • The hierarchy of concepts is built ‘bottom up’ without predefined rules by addressing the ‘representation problem’.

This is similar to the way a child learns ‘what a dog is’, i.e. by understanding the sub-components of a concept, e.g. the behaviour (barking), the shape of the head, the tail, the fur, etc., and then putting these concepts together into one bigger idea, i.e. the dog itself.

The (knowledge) representation problem is a recurring theme in Computer Science.

Knowledge representation incorporates theories from psychology which seek to understand how humans solve problems and represent knowledge. The idea is that if, like humans, computers were to gather knowledge from experience, it would avoid the need for human operators to formally specify all of the knowledge that the computer needs to solve a problem.

For a computer, the choice of representation has an enormous effect on the performance of machine learning algorithms. For example, based on the sound pitch, it is possible to know if the speaker is a man, woman or child. However, for many applications, it is not easy to know what set of features represent the information accurately. For example, to detect pictures of cars in images, a wheel may be circular in shape – but actual pictures of wheels may have variants (spokes, metal parts etc). So, the idea of representation learning is to find both the mapping and the representation.

If we can find representations and their mappings automatically (i.e. without human intervention), we have a flexible design to solve intuitive problems. We can adapt to new tasks and we can even infer new insights without observation; for example, based on the pitch of a sound, we can infer an accent and hence a nationality. The mechanism is self-learning. Deep Learning applications are best suited for situations which involve large amounts of data and complex relationships between different parameters. Training a neural network involves repeatedly showing it that “given an input, this is the correct output”. If this is done enough times, a sufficiently trained network will mimic the function you are simulating. It will also ignore inputs that are irrelevant to the solution. Conversely, it will fail to converge on a solution if you leave out critical inputs. This model can be applied to many scenarios, as we see below in a simplified example.

An example of learning through layers

Deep learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts. This approach works for subjective and intuitive problems which are difficult to articulate.

Consider image data. Computers cannot understand the meaning of a collection of pixels. Mappings from a collection of pixels to a complex Object are complicated.

With deep learning, the problem is broken down into a series of hierarchical mappings – with each mapping described by a specific layer.

The input (representing the variables we actually observe) is presented at the visible layer. Then a series of hidden layers extracts increasingly abstract features from the input, with each layer concerned with a specific mapping. Note, however, that this process is not predefined, i.e. we do not specify what the layers select.

For example: From the pixels, the first hidden layer identifies the edges

From the edges, the second hidden layer identifies the corners and contours

From the corners and contours, the third hidden layer identifies the parts of objects

Finally, from the parts of objects, the fourth hidden layer identifies whole objects

Image and example source: Yoshua Bengio book – Deep Learning

Implications for IoT

To recap:

  • Deep learning algorithms apply to many areas including Computer Vision, Image recognition, pattern recognition, speech recognition, behaviour recognition etc
  • Deep learning systems are possible to implement now because of three reasons: High CPU power, Better Algorithms and the availability of more data. Over the next few years, these factors will lead to more applications of Deep learning systems.
  • Deep learning applications are best suited for situations which involve large amounts of data and complex relationships between different parameters.
  • Solving intuitive problems: Training a Neural network involves repeatedly showing it that: “Given an input, this is the correct output”. If this is done enough times, a sufficiently trained network will mimic the function you are simulating. It will also ignore inputs that are irrelevant to the solution. Conversely, it will fail to converge on a solution if you leave out critical inputs. This model can be applied to many scenarios

In addition, we have limitations in the technology. For instance, we have a long way to go before a Deep Learning system can figure out that you are sad because your cat died (although it seems CogniToys, based on IBM Watson, is heading in that direction). The current focus is more on identifying photos or guessing age from photos (as with Microsoft’s Project Oxford API).

And we do indeed have a way to go, as Andrew Ng reminds us when he compares Artificial Intelligence to building a rocket ship:

“I think AI is akin to building a rocket ship. You need a huge engine and a lot of fuel. If you have a large engine and a tiny amount of fuel, you won’t make it to orbit. If you have a tiny engine and a ton of fuel, you can’t even lift off. To build a rocket you need a huge engine and a lot of fuel. The analogy to deep learning [one of the key processes in creating artificial intelligence] is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.”

Today, we are still limited by technology from achieving scale. Google’s neural network that identified cats ran on 16,000 CPU cores. In contrast, a human brain has an estimated 100 billion neurons!

There are some scenarios to which back-propagation neural networks are suited:

  • A large amount of input/output data is available, but you’re not sure how the inputs relate to the outputs. Thus, we have a large number of “given an input, this is the correct output” scenarios which can be used to train the network, because it is easy to create a number of examples of correct behaviour.
  • The problem appears to have overwhelming complexity. The complexity arises from a low rules base and high dimensionality, and from data which is not easy to represent. However, there is clearly a solution.
  • The solution to the problem may change over time, within the bounds of the given input and output parameters (i.e., today 2+2=4, but in the future we may find that 2+2=3.8), and outputs can be “fuzzy”, or non-numeric.
  • Domain expertise is not strictly needed because the output can be purely derived from inputs: this is controversial, because it is not always possible to model an output based on the input alone. However, consider the example of stock market prediction. In theory, given enough cases of inputs and outputs for a stock value, you could create a model which would predict unknown scenarios if it was trained adequately using deep learning techniques.
  • Inference: we need to infer new insights without observation. For example, based on the pitch of a sound, we can infer an accent and hence a nationality.

Given an IoT domain, we could consider the top-level questions:

  • What existing applications can be complemented by Deep Learning techniques by adding an intuitive component (e.g. in smart cities)?
  • What metrics are being measured and predicted? And how could we add an intuitive component to the metric?
  • What applications exist in computer vision, image recognition, pattern recognition, speech recognition, behaviour recognition, etc. which also apply to IoT?

Now, extending more deeply into the research domain, here are some areas of interest that I am following.

Complementing Deep Learning algorithms with IoT datasets

In essence, these techniques/strategies complement Deep learning algorithms with IoT datasets.

1)      Deep Learning algorithms and time series data: Time series data (coming from sensors) can be thought of as a 1D grid taking samples at regular time intervals, while image data can be thought of as a 2D grid of pixels. This allows us to model time series data with Deep Learning algorithms (most sensor/IoT data is time series). It is relatively less common to explore Deep Learning for time series, but there are already some instances of this approach (Deep Learning for Time Series Modelling to predict energy loads using only time and temp data). A minimal sketch of this 1D-grid view follows this list.

2)      Multiple modalities: Multimodality in deep learning algorithms is being explored, in particular cross-modality feature learning, where better features for one modality (e.g., video) can be learned when multiple modalities (e.g., audio and video) are present at feature learning time.

3)      Temporal patterns in Deep learning: In their recent paper, Ph.D. student Huan-Kai Peng and Professor Radu Marculescu, from Carnegie Mellon University’s Department of Electrical and Computer Engineering, propose a new way to identify the intrinsic dynamics of interaction patterns at multiple time scales. Their method involves building a deep-learning model that consists of multiple levels; each level captures the relevant patterns of a specific temporal scale. The newly proposed model can be also used to explain the possible ways in which short-term patterns relate to the long-term patterns. For example, it becomes possible to describe how a long-term pattern in Twitter can be sustained and enhanced by a sequence of short-term patterns, including characteristics like popularity, stickiness, contagiousness, and interactivity. The paper can be downloaded HERE
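As a concrete illustration of point 1 above, the sketch below treats a sensor stream as a 1D grid and slides a small kernel over it, exactly as a convolutional layer slides filters over a 2D grid of pixels. The kernel values here are illustrative; in a deep network they would be learned.

    # Minimal sketch: 1D convolution over a sensor time series, the
    # "1D grid" analogue of convolving a 2D grid of image pixels.
    import numpy as np

    def conv1d(signal, kernel):
        """Valid-mode 1D convolution (no padding)."""
        k = len(kernel)
        return np.array([np.dot(signal[i:i + k], kernel)
                         for i in range(len(signal) - k + 1)])

    readings = np.sin(np.linspace(0, 20, 200))   # stand-in sensor stream
    trend_filter = np.array([-1.0, 0.0, 1.0])    # responds to rising/falling trends
    features = conv1d(readings, trend_filter)    # local temporal features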

Implications for Smart cities

I see Smart cities as an application domain for the Internet of Things. Many definitions exist for Smart cities/future cities. From our perspective, Smart cities refer to the use of digital technologies to enhance performance and wellbeing, to reduce costs and resource consumption, and to engage more effectively and actively with citizens (adapted from Wikipedia). Key ‘smart’ sectors include transport, energy, health care, water and waste. A more comprehensive list of Smart City/IoT application areas is: intelligent transport systems (including autonomous vehicles), medical and healthcare, environment, waste management, air quality, water quality, accident and emergency services, and energy, including renewables. In all these areas we could find applications to which we could add an intuitive component based on the ideas above.

Typical domains will include computer vision, image recognition, pattern recognition, speech recognition and behaviour recognition. Of special interest are new areas such as self-driving cars, e.g. the Lutz pod, and even larger vehicles such as self-driving trucks.

Conclusions

Deep Learning involves learning through layers which allows a computer to build a hierarchy of complex concepts out of simpler concepts. Deep Learning is used to address intuitive applications with high dimensionality. It is an emerging field, and over the next few years, due to advances in technology, we are likely to see many more applications in the Deep Learning space. I am specifically interested in how IoT datasets can be used to complement Deep Learning algorithms. This is an emerging area with some examples shown above. I believe that it will have widespread applications, many of which we have not fully explored (as in the Smart city examples).

I see this article as part of an evolving theme. Future updates will explore how Deep learning algorithms could apply to IoT and Smart city domains. Also, I am interested in complementing Deep learning algorithms using IoT datasets.

I elaborate these ideas in the Data Science for Internet of Things program (modelled on the course I teach at Oxford University and UPM – Madrid). I will also present these ideas at the International Conference on City Sciences at Tongji University in Shanghai and the Data Science for IoT workshop at the IoT World event in San Francisco.

Please connect with me if you want to stay in touch on linkedin and for future updates



List of IoT Platforms

IoT platforms make the developer’s life easier by offering independent functionality which the applications they write can use to achieve their objective, saving them from the task of reinventing the wheel. Given here is a list of useful IoT platforms.

 

Kaa

Kaa is a flexible open source platform, licensed under Apache 2.0, for building, managing, and integrating connected software in IoT. Kaa’s “data schema” definition language provides a universal level of abstraction to achieve cross-vendor product interoperability. Kaa supports multiple client platforms by offering endpoint SDKs in various programming languages. In addition, Kaa’s powerful back-end functionality greatly speeds up product development, allowing vendors to concentrate on maximizing their product’s unique value to the consumer.

 

Axeda

The Axeda Platform is a complete M2M and IoT data integration and application development platform with infrastructure delivered as a cloud-based service.

 

Arrayent

The Arrayent Connect Platform is an IoT platform that helps connect products to smartphones and web applications. It comes with an agent which helps embedded devices connect to the cloud, a cloud-based IoT operating system, a mobile framework and a business intelligence reporting system.

 

Carriots

Carriots is a Platform as a Service (PaaS) designed for Internet of Things (IoT) and Machine to Machine (M2M) projects. It provides tools to collect and store data from devices, an SDK to build powerful applications, and the means to deploy and scale from tiny prototypes to thousands of devices.

 

Xively

Xively offers an enterprise IoT platform which helps connect products and users and manage the resulting information, with an interface for product deployment and health checks.

 

ThingSpeak

ThingSpeak is an open source Internet of Things application and API to store and retrieve data from things using the HTTP protocol over the Internet or via a Local Area Network. ThingSpeak enables the creation of sensor logging applications, location tracking applications, and a social network of things with status updates.
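As a flavour of how simple such an API can be, here is a minimal Python sketch that logs one reading to a ThingSpeak channel; the /update endpoint and field1 parameter follow ThingSpeak's public HTTP API, and the write key is a placeholder.

    # Minimal sketch: push a sensor reading to ThingSpeak over HTTP.
    import requests

    def log_reading(value, api_key="YOUR_WRITE_API_KEY"):
        resp = requests.post(
            "https://api.thingspeak.com/update",
            data={"api_key": api_key, "field1": value},
        )
        resp.raise_for_status()
        return resp.text   # entry ID of the new point ("0" means rejected)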

 

The Intel® IoT Platform

The Intel® IoT Platform is an end-to-end reference model and family of products from Intel that works with third-party solutions to provide a foundation for seamlessly and securely connecting devices, delivering trusted data to the cloud, and delivering value through analytics.

A votable and rankable list of these platforms can be found at Vozag.

 

Will Javascript be the Language of IoT?


JavaScript has proven itself worthy for web applications, both client- and server-side, but does it have the potential to be the de facto language of IoT?

This is a topic I posed to Patrick Catanzariti, founder of DevDiner.com, a site for developers looking to get involved in emerging tech. Patrick is a regular contributor and curator of developer news and opinion pieces on new technology such as the Internet of Things, virtual/augmented reality and wearables. He is a SitePoint contributing editor, an instructor at SitePoint Premium and O'Reilly, a Meta Pioneer and freelance web developer who loves every opportunity to tinker with something new in a tech demo.

Why does IoT require a de facto language any more than any other system? Wouldn't that stifle future language evolution?

Honestly, I think it's a bit too much to ask for every single IoT device out there to run on JavaScript or any one de facto language. That's unbelievably tough to manage. Getting the entire world of developers to agree on anything is pretty difficult. Whatever solution the world of competing tech giants and startups comes to (which is likely to be a rather fragmented one if current trends are anything to go by), the most important thing is that these devices need to be able to communicate effectively with each other, with as few barriers as possible. They need to work together. It's the "Internet of Things". The entire benefit of connecting anything to the Internet is allowing it to speak to other devices at a massive scale. I think we'd be able to achieve this goal even with a variety of languages powering the IoT. So from that standpoint, I think it's totally okay for various devices to run on whichever programming language suits them best.

On the other hand, we need to honestly look at the future of this industry from a developer adoption and consistency perspective. The world of connected devices is going to skyrocket. We aren't talking about a computer in every home, we're talking dozens of interconnected devices in every home. If each one of those devices is from a different company which decided on a different programming language to use, things are going to get very tough to maintain. Are we going to expect developers to understand all programming languages like C, C++, JavaScript, Java, Go, Python, Swift and more to be able to develop solutions for the IoT? Whilst I'm not saying that's impossible to do, and I'm sure there'll be programmers up to the task, I worry that will impact the quality of our solutions. Every language comes with its quirks and best practices; it'll be tough to ensure every developer knows how to create best-practice software for every language. Managing the IoT ecosystem might become a costly and difficult endeavour if it is that fragmented.

I've no issue with language evolution, however if every company decides to start its own language to better meet the needs of the IoT, we're going to be in a world of trouble too. The industry needs to work together on the difficulties of the IoT, not separately. The efforts of the Open Interconnect Consortium, AllSeen Alliance and IoT Trust Framework are all positive signs towards a better approach.

C, C++ and Java always seem to be foundational languages that are used by all platforms, so why do you think JavaScript will be the programming language of IoT?

My position is actually a bit more open than having JavaScript as the sole programming language of the IoT. I don't think that's feasible. JavaScript isn't great as a lower level language for memory management and the complexities of managing a device to that extent. That's okay. We are likely to have a programming language more suited to that purpose, like C or C++, as the de facto standard operational language. That would make perfect sense and has worked for plenty of devices so far. The issues I see are in connecting these devices together nicely and easily.

My ideal world would involve having devices running on C or C++ with the ability to also run JavaScript on top for the areas in which JavaScript is strongest. The ability to send out messages in JSON to other devices and web applications. That ability alone is golden when it comes to parsing messages easily and quickly. The Internet can speak JavaScript already, so for all those times when you need to speak to it, why not speak JavaScript? If you've got overall functionality which you can share between a Node server, front end web application and a dozen connected IoT devices, why not use that ability?

JavaScript works well with the event driven side of things too. When it comes to responding to and emitting events to a range of devices and client web applications at once, JavaScript does this pretty well these days.

JavaScript is also simpler to use, so for a lot of basic functionality like triggering a response on a hardware pin or retrieving data from a sensor, why overcomplicate it? If it's possible to write code that is clear and easy for many developers to understand and use without needing to worry about the lower level side of things - why not? We have a tonne of JavaScript developers out there already building for the web, and having them on board to help join these devices to their ecosystem of web applications just makes sense.

Basically, I think we're looking at a world where devices run programming languages like C at their core but also can speak JavaScript for the benefits it brings. Very similar to what it looks like IoT.js and JerryScript will bring. I really like the Pebble Smartwatch's approach to this. Their watches run C but their apps use JavaScript for the web connectivity.

When it comes to solutions like IoT.js and JerryScript, they're written initially in C++. However they're providing an entire interface to work with the IoT device via JavaScript. One thing I really like about the IoT.js and JerryScript idea is that I've read that it works with npm - the Node Package Manager. This is a great way of providing access to a range of modules and solutions that already exist for the JavaScript and Node ecosystems. If IoT.js and JerryScript manage memory effectively and can provide a strong foundation for all the low level side of things, then it could be a brilliant way to help make developing for the IoT easier and more consistent with developing for the web with all the benefits I mentioned earlier. It would be especially good if the same functionality was ported to other programming languages too, that would be a fantastic way of getting each IoT device to some level of compatibility and consistency.

I'm hoping to try IoT.js and JerryScript out on a Raspberry Pi 2 soon, I'm intrigued to see how well it runs everything.

What do developers need to consider when building apps for IoT?

Security - If you are building an IoT device which is going to ship out to thousands of people, think security first. Make sure you have a way of updating all of those devices remotely (yet securely) with a security fix if something goes wrong. There will be bugs in your code. Security vulnerabilities will be found in even the most core technologies you are using. You need to be able to issue patches for them!

Battery life - If everyone needs to change your brand of connected light bulbs every two months because they run out of juice - that affects the convenience of the IoT. IoT devices need to last a long time. They need to be out of the way. Battery life is crucial. Avoid coding things in a way which drains battery power unnecessarily.

Compatibility - Work towards matching a standard like the Open Interconnect Consortium or AllSeen Alliance. Have your communication to other devices be simple and open so that your users can benefit from the device working with other IoT devices in new and surprising ways. Don't close it off to your own ecosystem!

What tools do you recommend for developing apps in IoT?

I'm a fan of the simple things. I still use Sublime Text for my coding most of the time as it's simple and out of the way, yet supports code highlighting for a range of languages and situations. It works well!

Having a portable 4G Wi-Fi dongle is also very very valuable for working on the go with IoT devices. It serves as a portable home network and saves a lot of time as you can bring it around as a development Wi-Fi network you turn on whenever you need it.

Heroku is great as a quick free platform to host your own personal IoT prototypes on too while you're testing them out. I often set up Node servers in Heroku to manage my communication between devices and it is the smoothest process I've found out of all of the hosting platforms so far.

For working locally - I've found a service called ngrok is perfect. It creates a tunnel to the web from your localhost, so you can host a server locally but access it online via a publicly accessible URL while testing. I've got a guide on this and other options like it on SitePoint.

Are you seeing an uptick in demand for IoT developers?

I've seen demand slowly rising for IoT developers, but not much of a developer base taking the time to get involved. I think partially it is because developers don't know where to start, or don't realise how much of their existing knowledge already applies to the IoT space. It's actually one of the reasons I write at SitePoint as a contributing editor - my goal is to try and get more developers thinking about this space. The more developers out there who are getting involved, the higher the chances we hit those breakthrough ideas that can change the world. I really hope that having devices enabled with JavaScript helps spur on a whole community of developers, who've spent their lives focused on the value of interconnected devices and shared information, to get involved in the IoT.

My latest big website endeavour called Dev Diner (http://www.devdiner.com) aims to try and make it easier for developers to get involved with all of this emerging tech too by providing guides on where to look for information, interviews and opinion pieces to get people thinking. The more developers we get into this space, the stronger we will all be as a community! If you are reading this and you're a developer who has an Arduino buried in their drawer or a Raspberry Pi 2 still in their online shopping cart - just do it. Give it a go. Think outside the box and build something. Use JavaScript if that is your strength. If you're stronger at working with C or C++, work to your strength but know that JavaScript might be a good option to help with the communication side of things too.

For more on Patrick’s thoughts on Javascript, read his blog post “Why JavaScript and the Internet of Things?” and catch his O’Reilly seminar here.


The topic of IoT and farming keeps coming up.

Last month Steve Lohr of the New York Times wrote a fantastic piece on The Internet of Things and the Future of Farming. His colleague Quentin Hardy wrote a similar piece, albeit with a big data slant, in November 2014. If you have not yet read either article, I suggest you take the time to do so and also watch the video of IoT at work at a modern farm. It’s one of the better IoT case studies I’ve come across and shows real and practical applications and results.

Both stories highlight Tom Farms, a multi-generation, family-owned farm in northern Indiana. The Toms won’t be setting up a stand at your local farmers market to hawk their goods. With over 19,000 acres, they are feeding a nice portion of America, conducting farming on an industrial scale and producing shipments of more than 30 million pounds of seed corn, 100 million pounds of corn, and 13 million pounds of soybeans each year.

As the video points out, technology, data and connectivity have gotten them to this scale. After the farm crisis of the 1980s, they doubled down and bought more land from other struggling farmers. Along the way they were proactive in researching and developing new production technologies, everything from sensors on the combine and GPS data to self-driving tractors and iPhone apps for irrigation.

Farmers and Tablet PC

Photo Credit: Gary McKenzie on Flickr

All this technology is taking farming to a new level, in what is known as Precision Agriculture. The holy grail of precision agriculture is to optimize returns on inputs while preserving resources. Its most common use in modern farming is guiding tractors with GPS. But what other technologies are out there?

For that, the Wall Street Journal yesterday explored startups that put data in farmers’ hands. Startups like Farmobile LLC, Granular Inc. and Grower Information Services Cooperative are challenging data-analysis tools from big agricultural companies such as Monsanto Co., DuPont Co., Deere & Co. and Cargill Inc.

The new crop from all of these technologies is data.

This changes the economics for farmers, making them traders not just in crops but in data, potentially giving them an edge against larger competitors, like Tom Farms, that benefit from economies of scale.

With venture investment in so-called agtech start-ups reaching $2.06 billion in the first half of this year, there will be plenty of bytes in every bushel.

For a deep dive into Precision Agriculture, the history and the technologies behind it, I suggest registering for and reading the Foreign Affairs article, “The Precision Agriculture Revolution, Making the Modern Farmer.”

Read more…

Mapping the Internet of Things

You would think that in this day and age of infographics, a map laying out the ecosystem of the Internet of Things would be easy to find. Surprisingly, a Google search doesn’t appear to return much. Neither does a Twitter search.

Recently, though, I found two worth sharing: one from Goldman Sachs, and one from Chris McCann that I found very interesting - A Map of The Internet of Things Market.

Goldman Sachs’ map is pretty generic, but it takes IoT-related items all the way from the consumer to the Industrial Internet. In a September 2014 report, “The Internet of Things: Making sense of the next mega-trend”, Goldman states that IoT is emerging as the third wave in the development of the Internet. Much of what we hear about today is on the consumer end of the spectrum - early, simple products like fitness trackers and thermostats. On the other end of the spectrum, and what I think IoT Central is all about, is the Industrial Internet. The opportunity in the global industrial sector will dwarf consumer spend. Goldman states that the industrial sector is poised to undergo a fundamental structural change akin to the industrial revolution as we usher in the IoT. All equipment will be digitized and more connected, establishing networks between machines, humans, and the Internet, and leading to the creation of new ecosystems that enable higher productivity, better energy efficiency, and higher profitability. Goldman predicts that the IoT opportunity for Industrials could amount to $2 trillion by 2020.

[Image: Goldman Sachs IoT map, from the September 2014 report]

 

Chris McCann, who works at Greylock Partners, has an awesome map of the Internet of Things Market (below). This is what venture capitalists do of course - analyze markets and find opportunities for value by understanding the competitive landscape. This map is great because I think it can help IoT practitioners gain a better understanding of the Internet of Things market and how all of the different players fit together.

The map is not designed to be comprehensive, but given the dearth of available guidance, it is a great starting point. The map is heavily geared towards the startup space (remember, the author is a VC), and I think he leaves out a few machine-to-machine vendors, software platforms and operating systems.

Other maps I found that are interesting are:

Thingful, a search engine for the Internet of Things. It provides a geographical index of connected objects around the world, including energy, radiation, weather, and air quality devices as well as seismographs. Near me in earthquake-prone Northern California I of course found a seismograph, as well as a weather station and an air quality monitoring station.

Shodan, another search engine of sorts for IoT.

And then there is this story of Rapid7’s HD Moore who pings things just for fun.

If you have any maps that you think are valuable, I would love for you to share them in the comments section.



[Image: Chris McCann’s Map of The Internet of Things Market]

Follow us on Twitter @IoTCtrl

Read more…

In the 1996 sci-fi blockbuster movie “Independence Day”, there is a comical scene near the end where actor Jeff Goldblum, playing computer expert David Levinson, writes a virus on his Macintosh PowerBook that disables an entire fleet of technologically advanced alien spaceships. The PowerBook 5300 used in the movie had 8 MB of RAM. How could this be?

Putting aside Apple paying for product placement, we’re not going to stop advanced alien life who are apparently Mac-compatible.

I cite the ridiculous Independence Day ending because I was recently reading through a number of IoT security stories and began thinking about the implications of connecting all these things to the network. How much computing power does one actually need to hack something of significance? Could a 1997 IBM ThinkPad running Windows 95 take down the power grid in the eastern United States? Far-fetched, yes, but not ridiculous.

Car hacks seem to be in the news recently. Recall last month’s Jeep hack and hijack. Yesterday, stories came out about hackers using small black dongles connected to a Corvette’s diagnostic ports to control many parts of the car through, wait for it, text messages!

Beyond cars and numerous other consumer devices, IoT security has to reach hospitals, intelligent buildings, power grids, airlines, oil and gas exploration as well as every industry listed in the IRS tax code.

IBM’s X-Force Threat Intelligence Quarterly, 4Q 2014, notes that IoT will drag in its wake a host of unknown security threats. Even IBM, a powerful force in driving IoT forward, says that its model for IoT security is still a work in progress since IoT, as a whole, is still evolving. It does, however, suggest five security building blocks: secure operating systems, unique identifiers for each device, data privacy protection, strong application security, and strong authentication and access control.

In the end, it will be up to manufacturers to build security from the ground up and continually work with the industry to make everything more secure. As we coalesce around an ever-evolving threat landscape, it will be the responsibility of smaller manufacturers, giants like IBM, and industry organizations like the Industrial Internet Consortium and the Online Trust Alliance with its IoT Trust Framework to help prevent the ridiculous from happening.

 

Read more…

Do You Believe the Hype?

I’m guilty of hype.

As a communications consultant toiling away at public relations, media relations and corporate communications, I’ve had my fair share of businesses and products that I’ve helped get more attention than they probably deserved. Indeed, when it comes to over-hyping anything, it’s guys like me and my friends in the media who often take it too far.

Recently though, I came across an unlikely source of hype - the McKinsey Global Institute.

In a June 2015 report that I’m now reading, McKinsey states, “The Internet of Things—digitizing the physical world—has received enormous attention. In this research, the McKinsey Global Institute set to look beyond the hype to understand exactly how IoT technology can create real economic value. Our central finding is that the hype may actually understate the full potential of the Internet of Things…” (emphasis is mine).

If McKinsey is hyping something, should we believe it?

Their report, “The Internet of Things: Mapping the Value Beyond the Hype”, does point out that “capturing the maximum benefits will require an understanding of where real value can be created and successfully addressing a set of systems issues, including interoperability.”

I think this is where the race is today - finding the platforms for interoperability, compiling data sources, building security into the system and developing the apps that deliver true value. We have a long way to go, but investment and innovation are only growing.

If done right, the hype just may be understated. McKinsey finds that IoT has a total potential economic impact of $3.9 trillion to $11.1 trillion a year by 2025. They state that, including consumer surplus, this would be equivalent to about 11 percent of the world economy!

Do you believe the hype?

 

Read more…

Given all the buzz in the market around IoT, we looked at related projects on the crowdfunding website Kickstarter.com to see how IoT projects are doing relative to all the others.

We chose projects with either “IoT” or “Internet of Things” in their title or description; here are our findings.

The success rate of projects on Kickstarter is around 37.5% overall; for Technology projects it is 21%, a lot less than the average. In spite of this, our analysis shows that the success rate of IoT projects is 44%, which is pretty good news. People are realizing the importance of IoT and are willing to fund related projects.

 

The projects are concentrated almost entirely in the US and Europe, with a few scattered across Asia and Australia.

 

Because the projects are spread all over the world, their funding goals were set in different currencies, so to analyze the monetary side we normalized all figures to US dollars.

The total amount sought by all the IoT-related projects (ongoing, successful and failed) on Kickstarter is around $4.7 million, and the actual pledged amount is around $1.5 million.

If you consider only the projects that made it, the total sought and pledged is approximately $1.2 million. So only 2% of the pledged amount went to unsuccessful projects, which is usually the case with most projects on Kickstarter.

The average requested funding across all projects is around $60 thousand, while the average for the successfully funded projects is around $44 thousand. For the failed projects it is $3,500.
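For those curious about the mechanics, here is an illustrative JavaScript sketch of the normalization and success-rate computation described above. The exchange rates and project records are invented placeholders, not Kickstarter data:

```javascript
// Hypothetical, simplified version of the analysis pipeline.
const fxToUsd = { USD: 1.0, EUR: 1.09, GBP: 1.55, AUD: 0.73 }; // example rates

// One record per project matching "IoT" / "Internet of Things".
const projects = [
  { goal: 50000, pledged: 61000, currency: "USD", state: "successful" },
  { goal: 20000, pledged: 4000,  currency: "EUR", state: "failed" },
  { goal: 10000, pledged: 12500, currency: "GBP", state: "successful" },
];

// Normalize goals and pledges to US dollars.
const inUsd = projects.map(p => ({
  ...p,
  goalUsd: p.goal * fxToUsd[p.currency],
  pledgedUsd: p.pledged * fxToUsd[p.currency],
}));

const successful = inUsd.filter(p => p.state === "successful");
const successRate = successful.length / inUsd.length;
const totalPledgedUsd = inUsd.reduce((sum, p) => sum + p.pledgedUsd, 0);

console.log(`Success rate: ${(successRate * 100).toFixed(1)}%`);
console.log(`Total pledged: $${totalPledgedUsd.toFixed(0)}`);
```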

The top 10 successfully funded projects, along with their links, are given below.

 

Read more…

Big Data, IOT and Security - OH MY!

[Image: Topsy chart showing mentions of Big Data, IoT and Security over the last 30 days]

 

While we aren’t exactly “following the yellow brick road” these days, you may be feeling a bit like Dorothy from “The Wizard of Oz” when it comes to these topics. No my friend, you aren’t in Kansas anymore! As seen above from Topsy, these three subjects are extremely popular these days, and for the last 30 days they seem to follow a similar pattern (coincidence?).

 

The internet of things is not just a buzzword and is no longer a dream - sensors abound. The world is on its way to becoming totally connected, although it will take time to work out a few kinks here and there (with a great foundation, you create a great product; this foundation is what will take the most time). Your appliances will talk to you in your “smart house” and your “self-driving car” will take you to your super-tech office where you will work with ease thanks to all the wonders of technology. But let’s step back to reality and think: how is all this going to come about, what will we do with all the data collected, and how will we protect it?

 

First things first: all the sensors have to be put in place, and many questions have to be addressed. Does a door lock by one vendor communicate with a light switch by another vendor? Do you want the thermostat to be part of the conversation? Will anyone else be able to see my info or get into my home? http://www.computerworld.com/article/2488872/emerging-technology/explained--the-abcs-of-the-internet-of-things.html

How will all the needed sensors be installed, and will there be any “human” interaction? It will take years to put all the needed sensors in place, but some organizations are already engaging in the IOT here in the US. Hotels, as one example, are using sensors connected to the products available for sale in each room. That is great, but I recently had an experience that showed how “people” remain a vital part of the “IOT”. I went to check out of a popular hotel in Vegas and was asked if I had drunk one of the coffees in the room. I replied, “no, why?” and was told that the sensor showed I had either drunk or moved the coffee. The hotel clerk verified that I had “moved” and not “drunk” the coffee, but without her I would have been billed and had to refute the charge. Refuting charges is not exactly good for business, and customer service having to handle “I didn’t purchase this” disputes 24/7 wouldn’t exactly make anyone’s day, so thank goodness for human interaction right there on the spot.

 

“The Internet of Things” is not just a US effort. Asia, in my opinion, is far ahead of the US as far as the internet of things is concerned. In Korean subway stations, waiting commuters can browse and scan the QR codes of products which will later be delivered to their homes. (Source: Tesco) Transport for London’s central control centers use aggregated sensor data to deploy maintenance teams, track equipment problems, and monitor goings-on in the massive, sprawling transportation system. Telent’s Steve Pears said in a promotional video for the project that "We wanted to help rail systems like the London Underground modernize the systems that monitor its critical assets - everything from escalators to lifts to HVAC control systems to CCTV and communication networks." The new smart system creates a computerized and centralized replacement for a public transportation system that in many cases used notebooks and pens. http://www.fastcolabs.com/3030367/the-london-underground-has-its-own-internet-of-things

 

But isn't the Internet of Things too expensive to implement? Many IoT devices rely on multiple sensors to monitor the environment around them. The cost of these sensors declined 50% in the past decade, according to Goldman Sachs, and prices are expected to continue dropping at a steady rate, leading to even more cost-effective sensors. http://www.businessinsider.com/four-elements-driving-iot-2014-10

 

 

The Internet of Things is not just about gathering data but also about analyzing and using it. All this data generated by the internet of things, when used correctly, will help us in our everyday lives as consumers and help companies keep us safer by predicting, and thus avoiding, issues that could harm or delay us - not to mention the costs that could be reduced by finding patterns in data for transportation, healthcare, banking... the possibilities are endless.

 

Let’s talk about security and data breaches. Now you may be thinking: I’m in analytics or data science, why should I be concerned with security? Let’s take a look at several breaches that have made the headlines lately.

 

Target recently suffered a massive security breach thanks to attackers infiltrating a third party (http://www.businessweek.com/articles/2014-03-13/target-missed-alarms-in-epic-hack-of-credit-card-data), and so did Home Depot (http://www.usatoday.com/story/money/business/2014/11/06/home-depot-hackers-stolen-data/18613167/). PCWorld said, “Data breach trends for 2015: Credit cards, healthcare records will be vulnerable” http://www.pcworld.com/article/2853450/data-breach-trends-for-2015-credit-cards-healthcare-records-will-be-vulnerable.html

 

 

Sony was hit by hackers on Nov. 24, resulting in a company wide computer shutdown and the leak of corporate information, including the multimillion-dollar pre-bonus salaries of executives and the Social Security numbers of rank-and-file employees. A group calling itself the Guardians of Peace has taken credit for the attacks. http://www.nytimes.com/2014/12/04/business/sony-pictures-and-fbi-investigating-attack-by-hackers.html?_r=0

 

[Image: Identity Theft Resource Center 2014 data breach statistics]

http://www.idtheftcenter.org/images/breach/DataBreachReports_2014.pdf

 

So how do we protect ourselves in a world of BIG DATA and the IOT?
Why should I – as a data scientist or analyst be worried about security, that’s not really part of my job is it? Well if you are a consultant or own your own business it is! Say, you download secure data from your clients and then YOU get hacked, guess who is liable if sensitive information is leaked or gets into the wrong hands? What if you develop a platform where the client’s customers can log in and check their accounts, credit card info and purchase histories are stored on this system, if stolen, it can set you up for a lawsuit. If you are a corporation, you are protected in some extents but what if you operate as a sole proprietor – you could lose your home, company and reputation. Still think security when dealing with big data isn’t important?

Organizations need to get better at protecting themselves and discovering that they’ve been breached plus we, the consultants, need to do a better job of protecting our own data and that means you can’t use password as a password! Let’s not make it easy for the hackers and let’s be sure that when we collect sensitive data and yes, even the data collected from cool technology toys connected to the internet, that we are security minded, meaning check your statements, logs and security messages - verify everything! When building your database, use all the security features available (masking, obfuscation, encryption) so that if someone does gain access, what they steal is NOT usable!
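To make that last point concrete, here is a minimal Node.js sketch of field-level encryption using the built-in crypto module. The key handling is deliberately oversimplified for illustration; in a real system the key would come from a secrets manager, never be generated per run or hard-coded:

```javascript
const crypto = require("crypto");

// For illustration only: a real system loads this key from a secrets manager.
const key = crypto.randomBytes(32); // 256-bit key for AES-256-GCM

function encryptField(plaintext) {
  const iv = crypto.randomBytes(12); // unique IV for every record
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const encrypted = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  // Store the IV and auth tag alongside the ciphertext; all three are
  // needed to decrypt and to detect tampering.
  return {
    iv: iv.toString("hex"),
    tag: cipher.getAuthTag().toString("hex"),
    ciphertext: encrypted.toString("hex"),
  };
}

// A stolen row now looks like random hex instead of a card number.
console.log(encryptField("4111-1111-1111-1111"));
```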

 

Be safe and enjoy what tech has to offer with peace of mind and at all cost, protect your DATA.

 

I’ll leave you with a few things to think about:


“Asset management critical to IT security”
"A significant number of the breaches are often caused by vendors but it's only been recently that retailers have started to focus on that," said Holcomb. "It's a fairly new concept for retailers to look outside their walls." (Source:  http://www.fierceretail.com/)

 

“Data Scientist: Owning Up to the Title”
Enter the Data Scientist; a new kind of scientist charged with understanding these new complex systems being generated at scale and translating that understanding into usable tools. Virtually every domain, from particle physics to medicine, now looks at modeling complex data to make our discoveries and produce new value in that field. From traditional sciences to business enterprise, we are realizing that moving from the "oil" to the "car", will require real science to understand these phenomena and solve today's biggest challenges. (Source:  http://www.datasciencecentral.com/profiles/blogs/data-scientist-owning-up-to-the-title)

 

 

Forget about data (for a bit) what’s your strategic vision to address your market?

Where are the opportunities given global trends and drivers? Where can you carve out new directions based on data assets? What is your secret sauce? What do you personally do on an everyday basis to support that vision? What are your activities? What decisions do you make as a part of those activities? Finally what data do you use to support these decisions?

http://www.datasciencecentral.com/profiles/blogs/top-down-or-bottom-up-5-tips-to-make-the-most-of-your-data-assets



Originally posted on Data Science Central 

Follow us @IoTCtrl | Join our Community

Read more…

At CES 2015, I was fascinated by all sorts of possible applications of IoT - socks with sensors, mattresses with sensors, smart watches, smart everything - it seems like a scene from a sci-fi movie has just come true. People are eager to learn more about what’s happening around them, and now they can.

 

While I was there I attended a talk given by David Pogue - he is awesome. He pointed out that the prevalence of the smartphone is the key to the realization of the phenomenon called the “Quantified Self.” I agreed with him. Smartphones play a vital role as a hub where all our personal data converge and are presented, seamlessly. The fact that you carry your smartphone around all the time, and that the screen size is perfect for revealing all that information, makes it a catalyst for wearable devices and IoT - or what we like to call it, the Intelligence of Things.

 

It’s all related: Big Data, IoT, wearables, cloud computing... While most data is uploaded to the cloud, client devices are generally powerful enough that the computing can be decentralized. Small data (client side) and big data (server side) thus form an ecosystem in which small data triggers the knowledge base cultivated by big data and performs predictive analysis and decision making in a timely manner. Furthermore, your smartphone gathers versatile data and is able to analyze cross-app data to personalize your application settings. For example, what about optimizing navigation based on my physical condition? Or how about suggesting the best route according to my health along with the weather? These individual data records might be small, but collectively they enrich the content of analysis and contribute some amazing value. We at BigObject really appreciate this context of Big Data.

 

Marc Andreessen once said, “I think we are all underestimating the impact of aggregated big data across many domains of human behavior, surfaced by smartphone apps.” For us here at BigObject, the next big thing in big data is finding a methodology that can link multiple data sources together and identify the meaningful connections between them. Most importantly, it must be responsive enough to deliver actionable insight and simple enough for people to adopt. That is the key to fulfilling a connected world.


Originally posted on Data Science Central

Read more…

The Internet of Things (IOT) will soon produce a massive volume and variety of data at unprecedented velocity. If "Big Data" is the product of the IOT, "Data Science" is its soul.

Let's define our terms:

Internet of Things (IOT): equipping all physical and organic things in the world with identifying intelligent devices, allowing the near real-time collection and sharing of data between machines and humans. The IOT era has already begun, albeit in its first primitive stage.

Data Science: the analysis of data creation. May involve machine learning, algorithm design, computer science, modeling, statistics, analytics, math, artificial intelligence and business strategy.

Big Data: the collection, storage, analysis and distribution/access of large data sets. Usually includes data sets with sizes beyond the ability of standard software tools to capture, curate, manage, and process the data within a tolerable elapsed time.

We are in the pre-industrial age of the technology and science used to process and understand data. Yet the early evidence provides hope that we can manage and extract knowledge and wisdom from this data to improve life, business and public services at many levels.

To date, the internet has mostly connected people to information, people to people, and people to business. In the near future, the internet will provide organizations with unprecedented data. The IOT will create an open, global network that connects people, data and machines.

Billions of machines, products and things from the physical and organic world will merge with the digital world, allowing near real-time connectivity and analysis. Machines and products (and every physical and organic thing) embedded with sensors and software - connected to other machines, networked systems, and to humans - allow us to cheaply and automatically collect and share data, analyze it and find valuable meaning. Machines and products in the future will have the intelligence to deliver the right information to the right people (or other intelligent machines and networks), any time, to any device. When smart machines and products can communicate, they help us and other machines understand so we can make better decisions, act fast, save time and money, and improve products and services.

The IOT, Data Science and Big Data will combine to create a revolution in the way organizations use technology and processes to collect, store, analyze and distribute any and all data required to operate optimally, improve products and services, save money and increase revenues. Simply put, welcome to the new information age, where we have the potential to radically improve human life (or create a dystopia - a subject for another time).

The IOT will produce gigantic amounts of data. Yet data alone is useless - it needs to be interpreted and turned into information. However, most information has limited value - it needs to be analyzed and turned into knowledge. Knowledge may have varying degrees of value - but it needs specialized manipulation to be transformed into valuable, actionable insights. Valuable, actionable knowledge has great value for specific domains and actions - yet it requires sophisticated, specialized expertise to be transformed into multi-domain, cross-functional wisdom for game-changing strategies and durable competitive advantage.

Big data may provide the operating system and special tools to get actionable value out of data, but the soul of the data, the knowledge and wisdom, is the bailiwick of the data scientist.

Originally posted on  Data Science Central
Read more…

The Internet of Things may be giving way to the Internet of Everything as more and more uses are dreamed up for the new wave of Smart Cities.

In the Internet of Things, objects have their own IP address, meaning that sensors connected to the web can send data to the cloud on just about anything: how much traffic is rolling through a stoplight, how much water you’re using, or how full a trash dumpster is.
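To picture what that means in practice, here is a hypothetical sketch of a connected dumpster reporting its fill level. The endpoint URL, device ID and payload shape are invented for illustration; a real city platform would define its own schema. It assumes Node.js 18+, which ships a built-in fetch:

```javascript
// Hypothetical sensor client: one reading, pushed to a cloud endpoint.
const reading = {
  deviceId: "dumpster-0042",
  fillLevelPercent: 87,                 // e.g. from an ultrasonic sensor
  timestamp: new Date().toISOString(),
};

fetch("https://city.example.com/api/readings", {  // invented endpoint
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(reading),
})
  .then(res => console.log("reported:", res.status))
  .catch(err => console.error("report failed:", err));
```

Multiply that tiny client by thousands of stoplights, meters and dumpsters and you have the data streams the examples below are built on.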

Cities are discovering how they can use these new technologies — and the data they generate — to be more efficient and cost effective in many different ways. And it’s a good thing, too; some estimates suggest that 66 percent of the world’s population will live in urban areas by the year 2050.

These are cutting-edge ideas, but here are some of the most fascinating ways Smart Cities are using big data and the Internet of Things to improve quality of life for their residents:

  • The city of Long Beach, California is using smart water meters to detect illegal watering in real time; the meters have helped some homeowners cut their water usage by as much as 80 percent. That’s vital when the state is going through its worst drought in recorded history and the governor has enacted the first-ever state-wide water restrictions.
  • Los Angeles uses data from magnetic road sensors and traffic cameras to control traffic lights and thus the flow (or congestion) of traffic around the city. The computerized system controls 4,500 traffic signals around the city and has reduced traffic congestion by an estimated 16 percent.  
  • Xcel Energy initiated one of the first ever tests of a “smart grid” in Boulder, Colorado, installing smart meters on customers’ homes that would allow them to log into a website and see their energy usage in real time. The smart grid would also theoretically allow power companies to predict usage in order to plan for future infrastructure needs and prevent brown out scenarios.
  • A tech startup called Veniam is testing a new way to create mobile wi-fi hotspots all over the city in Porto, Portugal. More than 600 city buses and taxis have been equipped with wifi transmitters, creating the largest free wi-fi hotspot in the world. Veniam sells the routers and service to the city, which in turn provides the wi-fi free to citizens, like a public utility. In exchange, the city gets an enormous amount of data — with the idea being that the data can be used to offset the cost of the wi-fi in other areas. For example, in Porto, sensors tell the city’s waste management department when dumpsters are full, so they don’t waste time, man hours, or fuel emptying containers that are only partly full.
  • New York City is creating the world’s first “quantified community” where nearly everything about the environment and residents will be tracked. The community will be able to monitor pedestrian traffic flow, how much of the solid waste collected is recyclable or food waste, and air quality. The project will even collect data on residents’ health and activity levels through an opt-in mobile app.
  • Songdo, South Korea has been conceived and built as the ultimate Smart City — a city of the future. Trash collection in the city is completely automated, through pipes connected to every building. The solid waste is sorted then recycled, buried, or burned for fuel. The city is partnering with Cisco to test other technologies, including home appliances and utilities controlled by your smartphone, and even a tracking system for children (using microchips implanted in bracelets).

This is just the beginning of the integration of big data and the Internet of Things into daily life, but it is by no means the end. As our cities get smarter and begin collecting and sending more and more data, new uses will emerge that may revolutionize the way we live in urban areas.

Of course, more technology can also mean more opportunities for hackers and terrorists. (Anyone see Die Hard 4, where terrorists hacked the traffic control systems in Washington, D.C.?) The threat that a hacker could shut down a city’s power grid, traffic system, or water supply is real — mostly because the technology is so new that cities and providers are not taking the necessary steps to protect themselves.

Still, it would seem that the benefits will outweigh the risks with these new data-driven technologies for cities, so long as the municipalities are paying attention to security and protecting their assets and their customers.

What’s your opinion? Are you for or against more integrated technologies in cities? I’d love to hear your thoughts in the comments below.

I hope you found this post interesting. I am always keen to hear your views on the topic and invite you to comment with any thoughts you might have.

About: Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance. You can read a free sample chapter here.


Originally posted on Data Science Central

Read more…
