How Computer Vision is Automating Package Handling at Multifamily Residential Buildings

Computer Vision-based imaging has become a powerful tool because it recognizes the context of everything around a particular object being monitored. Why is this such a boon? As Mo Cheema, Director of Solutions Design and Implementation at Position Imaging, puts it, “Technology is best utilized when eliminating boring, repetitive jobs that humans don’t enjoy doing anyway.” That’s why companies are using Computer Vision to automate mundane tasks like sorting green apples from red apples or separating recyclable items from the trash. Another important example is secure package management, where applications help multifamily property managers automate the package reception and distribution process. And if a resident accidentally picks up the wrong package, the system’s built-in audio guidance immediately alerts them to put it back.

It’s a timely and interesting trend that AIThority asked Mo to expand on for its readers. Learn how Computer Vision-based imaging is changing how deliveries are handled at multifamily residences in this eye-opening read, originally published on AIThority.


Microsoft’s Azure Cognitive Services made news recently when it announced a new service that lets developers automatically generate captions for images. This latest addition to the cognitive intelligence suite, which leverages Computer Vision technology, can reportedly generate image captions that are, in many cases, better or more accurate descriptions than what humans write. An image caption generated by a machine for a machine can be far more effective, making your Bing or Google search results much more relevant. This can help drive organic traffic to your webpage.

Computer Vision-based image captioning is a big milestone because it means AI systems are beginning to detect, understand, and describe an action or motion within the context of everything else around it. The system leverages deep learning to detect what an item is and the action it is performing, then uses Natural Language Generation (NLG) to describe it. Imagine what this technology could do for the blind or visually impaired; they could see through the eyes of a computer. For example, as a blind or visually impaired person walks down a sidewalk in an urban area, the system could detect an approaching intersection and announce the remaining distance, so the person can stop in time to avoid stepping into oncoming traffic.
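As a concrete illustration, here is a minimal Python sketch that asks Azure Cognitive Services to describe an image. The endpoint, key, and image URL are placeholders you would replace with your own; the calls follow the publicly documented azure-cognitiveservices-vision-computervision package.

```python
# A minimal sketch of machine-generated image captioning with Azure Cognitive Services.
# The endpoint, subscription key, and image URL below are placeholders.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-subscription-key>"                                    # placeholder

client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(key))

# Ask the service to describe a publicly accessible image in natural language.
analysis = client.describe_image("https://example.com/street-scene.jpg", max_candidates=1)

for caption in analysis.captions:
    print(f"{caption.text} (confidence: {caption.confidence:.2f})")
```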

The Phenomenon of Computer Vision

Computer Vision replicates the “visual” intelligence of the human brain. Humans perceive up to 80% of all impressions by means of sight, and 30% of the human cortex is devoted to vision, compared with only 8% for touch and 3% for hearing. Just as vision is the most important human sense, it is also critical for computers to gain a more robust understanding of the environment. If they can learn to understand an action in the context of everything else around it, they become that much more intelligent.

Modern Computer Vision relies on deep learning algorithms, also known as neural networks, to understand the objects it observes. These neural networks learn from massive amounts of visual data, finding patterns that let them arrive at a highly educated guess about what a certain object actually is. The algorithms are inspired by our understanding of how the brain functions, in particular the interconnections between neurons in the cerebral cortex.
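To make that “educated guess” concrete, the short sketch below runs an image through a neural network pretrained on millions of labeled photos and prints the most likely category. The model choice and image path are illustrative, not part of the original article.

```python
# A minimal sketch: a pretrained convolutional neural network making an
# educated guess about what an image contains. "apple.jpg" is a placeholder.
import torch
from PIL import Image
from torchvision import models

# Load a network pretrained on ImageNet (1,000 everyday object categories).
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()

# Preprocess the image the same way the network saw its training data.
preprocess = weights.transforms()
image = Image.open("apple.jpg")
batch = preprocess(image).unsqueeze(0)

# Run the network and convert its outputs into probabilities.
with torch.no_grad():
    probabilities = model(batch).softmax(dim=1)[0]

# Report the most probable category and the network's confidence in it.
top_prob, top_idx = probabilities.max(dim=0)
print(weights.meta["categories"][top_idx.item()], f"{top_prob:.1%}")
```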

Although Computer Vision, as a subsegment of Artificial Intelligence, has been around since the 1960s, recent breakthroughs have led to increased adoption of this technology. The increased processing power from microchip producers such as NVIDIA and Intel has also played a big role. As we continue to harness additional processing power and improve the technologies Computer Vision relies on, such as the rollout of 5G service, we are likely to see widespread adoption and an increased rate of automation. The Fourth Industrial Revolution is, indeed, underway.

Computer Vision Applications

Computer Vision is already advanced enough to be applied in many areas, and open-source software solutions are available in the hope that the public will use them to innovate and drive adoption in this still-young field.

There are many Computer Vision applications businesses are leveraging to automate or streamline processes. In healthcare, Computer Vision can detect cancer in CT scans better than doctors. In highly secure environments, retinal and fingerprint scanning can uniquely identify individuals to grant or restrict access. Wind turbines can be inspected for defects using footage from autonomous drones with high-definition mounted cameras. In addition, billions of dollars are being invested in autonomous transportation, where Computer Vision plays a big role in guiding vehicles by identifying obstacles, people, and road signs along the way.

Automating Package Handling with Computer Vision

Technology is best utilized when eliminating boring, repetitive jobs that humans don’t enjoy doing anyway. That’s why companies are using Computer Vision to automate mundane tasks like sorting green apples from red apples and separating recyclable items from the trash. This is helping automate the supply chain and redirecting humans to more complicated tasks, where they can apply their intelligence to solve problems that have never been solved before. A good example is Position Imaging, which is applying its Amoeba Computer Vision technology to help multifamily property managers automate the package handling process and redirect staff to manage residents rather than packages. It provides an enhanced experience for residents because they no longer have to wait in line or contact staff to pick up their packages.

Couriers deliver packages directly to the Smart Package Room, where the Amoeba Computer Vision technology virtually tags and monitors the location of each package, essentially keeping eyes on the packages 24/7. It then locates the package, which used to be the property manager’s task, and guides the resident to it at pickup. Thanks to its advanced surveillance capabilities, if a resident accidentally picks up the wrong package, built-in audio guidance alerts them immediately. The Smart Package Room is smart in the sense that it can see and make sense of actions, allowing it to track the location of items in 3D space. It doesn’t just produce a digital map of the room, but rather a 3D replica of the shelves with a 3D coordinate system locating each package in the room.
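To give a flavor of the bookkeeping such a room implies, here is a purely hypothetical Python sketch, not Position Imaging’s actual implementation, of a registry that tags packages with 3D coordinates, finds them for a resident, and flags a wrong pickup.

```python
# A hypothetical sketch of package tracking in 3D space. The class and method
# names are illustrative only and do not describe the Amoeba system itself.
from dataclasses import dataclass

@dataclass
class Package:
    tracking_id: str
    recipient: str
    position: tuple[float, float, float]  # (x, y, z) in meters within the room

class PackageRoom:
    def __init__(self) -> None:
        self.packages: dict[str, Package] = {}

    def register(self, package: Package) -> None:
        """Virtually 'tag' a package the moment the courier shelves it."""
        self.packages[package.tracking_id] = package

    def locate_for(self, resident: str) -> list[Package]:
        """Return the 3D positions of every package waiting for a resident."""
        return [p for p in self.packages.values() if p.recipient == resident]

    def verify_pickup(self, tracking_id: str, resident: str) -> str:
        """Check which package was picked up and respond with audio-style guidance."""
        package = self.packages.get(tracking_id)
        if package is None:
            return "Unknown package - please contact staff."
        if package.recipient != resident:
            return "That package belongs to someone else - please put it back."
        del self.packages[tracking_id]
        return "Enjoy your package!"

# Example usage:
room = PackageRoom()
room.register(Package("1Z999", "Unit 4B", (1.2, 0.4, 1.8)))
print(room.locate_for("Unit 4B"))
print(room.verify_pickup("1Z999", "Unit 7A"))  # triggers the put-it-back warning
```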

Conclusion

With all these advancements in Computer Vision, we have only just scratched the surface. The possibilities are endless, and the future is bright. The use of Computer Vision technology will soon be widespread, and access to information will become that much easier. Machines will perform computation faster than ever before and help us make better, faster decisions. Most importantly, these advancements will help automate mundane tasks like locating a package. As Forbes contributor Rob Toews recently said, “A wave of billion-dollar Computer Vision startups is coming.”
