You can build AI

Two eyes are better than one

8 min read · Nov 26, 2021


Most computer vision systems analyze a single camera stream. Sometimes, however, one camera is not enough and you need more eyes. This post describes a real-life project that uses AWS Panorama to determine the exact location of people on a factory floor by aligning the output of an AI model across multiple camera streams.

Artificial Intelligence (AI) projects in traditional facilities

AI can be a powerful tool to solve many business problems. However, it is not easy to use AI in real-life cases due to physical or economic constraints. In this post, I show how to use the recently launched AWS Panorama device to improve the accuracy of locating people on an arbitrary factory floor, using multiple camera streams and a pre-trained computer vision inference model. The solution is low-code and low-cost, and fits any industrial environment.
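To make the alignment idea concrete, here is a minimal sketch of how detections from multiple camera streams can be projected onto a shared floor plane and combined. It assumes each camera has a pre-calibrated homography mapping image pixels to floor coordinates; the matrices, pixel coordinates, and function names below are illustrative assumptions, not values or code from the actual deployment.

```python
import numpy as np

def image_to_floor(h: np.ndarray, point: tuple) -> np.ndarray:
    """Project an image point (e.g. the bottom-center of a person's
    bounding box) onto floor-plane coordinates via homography h."""
    x, y = point
    p = h @ np.array([x, y, 1.0])
    return p[:2] / p[2]  # normalize homogeneous coordinates

def fuse_estimates(points: list) -> np.ndarray:
    """Combine floor-plane estimates from several cameras by averaging."""
    return np.mean(np.stack(points), axis=0)

# Hypothetical homographies for two cameras, chosen so both map the same
# person to the same floor location (real ones come from calibration).
h_cam1 = np.array([[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 1.0]])
h_cam2 = np.array([[0.01, 0.0, 0.5], [0.0, 0.01, -0.5], [0.0, 0.0, 1.0]])

# The same person detected at different pixel coordinates on each camera.
est1 = image_to_floor(h_cam1, (640, 360))
est2 = image_to_floor(h_cam2, (590, 410))
location = fuse_estimates([est1, est2])
print(location)  # floor-plane coordinates in meters
```

Averaging is the simplest fusion strategy; a production system would weight each camera's estimate by its distance and viewing angle, since a detection near the edge of a frame projects with more error.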

There is a significant gap between the maturity of AI technology at digital-native companies such as Amazon or Google and at more traditional companies. Amazon can build magical services such as Amazon Go or Alexa, and Google can build Google Assistant or Style Detection (OK, the second one was only an April Fools' joke). However, when traditional enterprises try to create similar magic, it doesn't go so well.

Can we democratize AI and give every company access to this powerful technology? Can Hilton fight back with AI against the competition from the much younger Airbnb? Can AIG harness AI to compete better with digitally native rivals such as Lemonade or Hippo?

I believe we can.

The challenge

With perfect timing, AWS released the new version of the Panorama device, and my company, Aiola, started a large-scale implementation of AI in a big food company for food safety and quality assurance (FSQA). One of the difficulties in this implementation is knowing exactly where a problem was reported. Should we scan a barcode before filing a report? Should we stick NFC tags around the facility and tap them with a mobile device? These are cumbersome solutions in a messy real-life environment with cleaning chemicals and similar constraints.

Hey, don't we have powerful computer vision models that can identify where people are standing, similar to self-driving cars' technology to identify…




Guy Ernest is the co-founder and CTO of @aiOla, a promising AI startup that closes the loop between knowledge, people & systems. He is also an AWS ML Hero.