Can Robots and Humans Work Together?

Welcome to the Amazon Dark Fulfillment Center

A few years ago, Amazon calculated how many people it would need if it continued to grow its business and its fulfillment centers (FCs). It found that it would need to hire more than five million people to work in the FCs. If you have ever visited one of these massive facilities, you understand the scale of the operation: some FCs can handle a million orders a day. Limited scalability is a real issue for a company that wants to grow but can’t scale a critical part of its business efficiently. So Amazon started to work on building the “Dark Fulfillment Center.” In this vision, Amazon saves huge costs by automating its FC operation completely: self-driving trucks bring the products in, robots distribute them, other robots pick them up and put them in packages, and self-driving trucks carry the packages out on their way to the customers who ordered them. Amazon could also save a lot on the energy needed to light the FC and to air-condition it for the workers. Other savings would come from reduced “shrinkage” (items disappearing into workers’ pockets and bags…) and from no longer needing to train, feed, transport, or care for workers all the time.

However, this vision will not happen in the foreseeable future. Amazon will continue to invest in robots for its FCs, but at the same time it will continue to hire, train, and employ humans to work there. Amazon already has more than 100,000 robots in its FCs, but also more than 300,000 people. Its human workforce keeps growing along with the business, but not at the unreasonable, unscalable pace that would have been required had it not built these robots.

The majority of the robots are the Kiva driver robots, which pick up the heavy pods and line them up so that the human pickers can take an item from a pod and place it in the bin for packing. Most of the initial effort went into the drivers because they addressed the most problematic issue in the FC. Before the driver robots, people had to walk the floor to find the pods and the items. It took on average 20 minutes of human effort to fulfill a single order, mainly due to the long walks to and from the pods. Today, with the robots, the time needed has dropped to less than a minute. The pickers stand at their stations, and the robots bring the pods to them and then take them away again. The same people who used to be slow (walking for 8 or 10 hours is tough), expensive, and unproductive are now 20 to 30 times more productive. The working environment is also much better for the humans, even if they work in a sort of cage that prevents them from stepping onto the floor where the robots are driving. They wouldn’t be run over, as you might suspect, but the robots would have to stop, and that would delay shipments and deliveries. Thanks to the collaboration between people and machines, large fulfillment centers can now handle a million customer orders a day on peak days (Prime Day, Black Friday, or Cyber Monday).

So, Are Robots Better than Humans?

Many CEOs and CIOs dream of building automation into their businesses. People make mistakes, they don’t scale, and they are not always working, so automating their jobs can give the business a dramatic boost. At the same time, some employees feel that they will soon be replaced by these robots: truck drivers and Uber drivers by self-driving cars, call center agents receiving or making calls by chatbots, factory workers by robots, and so on.

These thoughts are fueled by technology demonstrations from the big companies, such as IBM Watson, which won the Jeopardy game show, Google Duplex, who (or should I say, that) can call businesses and carry on a human-like conversation, and machines that win the most complex human games, such as Chess and Go. Add movies such as Terminator and speeches or tweets from celebrities such as Elon Musk, and it seems that the robots are taking over our jobs.

I also often heard it from scientists working in the AI domain. They talk about “human-level performance” and publish many papers showing how their models are better than humans at some tasks (for example, “Surpassing Human-Level Performance on ImageNet Classification”). We also argued about the need to build a machine translation management tool that human translators could use to improve the output of the automatic machine translation systems. “In a few years machine translation will be better than the human translation,” they were saying.

The reality today, and for the next few years at least, is different. Machine learning systems are mostly single-task models. Watson can win Jeopardy, but it can’t perform any other intelligent task, not even playing a game of tic-tac-toe. Google Duplex can make a restaurant reservation, but it can’t hold a conversation on any other topic. Researchers are investigating methods for general-purpose AI, or for simpler multi-task models, but so far the success in this direction is limited.

Integrate People with Machines

Until we find out how to build general-purpose AI, the successful implementation is the one that integrates people, who generate the data and annotate it, with machine learning models, which are scalable and mostly automated.

“Nobody will be fired,” “Giving your people super-powers,” and “Augmentation instead of Automation” are all simple slogans that you can put on a banner to guide your next machine learning project and team. They can direct you to where to invest your efforts and what to expect from the different parts of the system. If you try to exclude the humans from the system entirely, you will miss their annotated data, you will lose their ability to step in when the system is confused, and you will have no way to bridge the gap until the automated system is good enough. Let’s examine an interesting example:

Amazon Style Check

Fashion is one of the retail categories in which Amazon is not the leader. There are many reasons for that, mainly the need to try on clothes for size and style. Style Check is one of Amazon’s many experiments in this domain, alongside Amazon Wardrobe, which was recently opened to the public, and the Echo Look, which also became generally available recently. The idea behind the service is that people take a couple of selfies in different outfits and the service advises which one is better. The service is designed to respond immediately, which forces Amazon to keep fashion experts ready across the globe (for example, in Seattle on the West Coast of the USA and in Tel Aviv in the Middle East, 10 hours apart), and that is complex and expensive. Even the big Amazon, with its extensive experience in building machine learning systems, has to keep real people ready to answer customers’ queries 24/7. The team building this service at Amazon also trains machine learning models on the historical data and shows the models’ predictions to the fashion experts, but it still uses people and will continue to use them for a long time. The machine learning models can be faster, they can “remember” previous user feedback on the fashion advice better (a favorite color, for example), and with more data they will get even better over time. Still, in many cases the confidence of the models is too low, or the human and the machine disagree (just as human experts sometimes disagree), so human eyes will be needed for a long time. Ideally, usage of the service will grow while the automatic capabilities of the models improve, so that the size of the human expert team can remain mostly constant. “Nobody will be fired.” The same team of human experts will be able to support exponential growth in usage without having to grow exponentially as well.
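To make the idea concrete, here is a minimal sketch of how such a human-in-the-loop service could route requests. It is not Amazon’s implementation: the `Prediction` class, the `route` function, and the 0.8 threshold are all hypothetical names and values chosen for illustration. The model answers on its own when it is confident, and the request goes to a fashion expert otherwise.

```python
from dataclasses import dataclass

# Hypothetical cut-off; a real system would tune it on historical data.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Prediction:
    preferred_outfit: str  # which selfie the model prefers, e.g. "A" or "B"
    confidence: float      # model confidence in the range [0, 1]


def route(prediction: Prediction, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "model" to answer automatically, or "expert" to ask a human."""
    return "model" if prediction.confidence >= threshold else "expert"


def automation_rate(predictions, threshold: float = CONFIDENCE_THRESHOLD) -> float:
    """Fraction of requests the model can answer without a human."""
    if not predictions:
        return 0.0
    automated = sum(1 for p in predictions if route(p, threshold) == "model")
    return automated / len(predictions)


def expert_requests(total_requests: int, rate: float) -> int:
    """Number of requests that still need human eyes at a given automation rate."""
    return round(total_requests * (1 - rate))


# A confident prediction is answered by the model; an unsure one goes to a person.
print(route(Prediction("A", 0.93)))  # model
print(route(Prediction("B", 0.55)))  # expert

# Usage grows tenfold, but the expert workload stays the same
# if the automation rate improves from 50% to 95%.
print(expert_requests(1_000, 0.50))   # 500
print(expert_requests(10_000, 0.95))  # 500
```

The last two lines are the “Nobody will be fired” slogan in numbers: as long as the models improve about as fast as usage grows, the same expert team can keep handling the cases the machine is unsure about. The expert answers can also be stored as new annotated data for retraining.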


We discussed a couple of examples from Amazon’s experience in building machine learning systems to grow its business (the fashion category) and its operations (robots in the fulfillment centers), and we focused on the critical balance between the role of humans and the role of machines in such systems. It is easy to rely too much on human intelligence or to expect too much of artificial intelligence, but systems built that way will fail in the short or the long term. The combination and integration of these two types of intelligence has proven many times to be the only scalable, long-term model.






Guy Ernest is the co-founder and CTO of @aiOla, a promising AI startup that closes the loop between knowledge, people & systems, making sure nothing is lost.