When a company receives an online order containing multiple items across different SKUs, it has to decide how best to fulfill the order while minimizing split shipments. A recent paper from researchers in China looked at using a clustering algorithm with product categories in an attempt to reduce split shipments.
An order is considered split when it contains two or more items that are stored in different warehousing locations, resulting in multiple shipments. For the research, they worked with "one of the biggest online supermarkets in China" and used data from the company over a six-day period. Over that time, the majority of the company's daily orders included multiple items. And of the multi-item orders, 76% were in multiple categories.
Their goal was to figure out which warehouses should store which categories to ensure that orders were split as little as possible.
At an online supermarket in China, orders are primarily multi-item
Manjeet Singh, the director of global operations science and analytics at DHL, said a company can reduce the number of packages and overall transportation cost per order if it can reduce split shipments.
"That's why companies would like to have as many items of product category, which generally are ordered together, in one location," said Singh, who was not involved in the research.
The paper also notes that more split shipments increase environmental pollution due to the uptick in packaging.
The researchers formulated a model that identifies products by SKU, but organized products by category. The goal was to create an algorithm that would help them to allocate these categories across the available warehouses in a way that would reduce split shipments.
The algorithm worked to reduce what they referred to as "out-links," which is when two categories in one order are in different warehouses.
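To make the out-link idea concrete, here is a minimal Python sketch based only on the definition above. The function names, category names and warehouse labels (`W1`, `W2`) are hypothetical illustrations, not taken from the paper, and the paper's actual formulation may count out-links differently.

```python
from itertools import combinations


def count_out_links(orders, category_to_warehouse):
    """Count category pairs in the same order that sit in different warehouses."""
    out_links = 0
    for order in orders:
        # Each distinct pair of categories in an order is checked once.
        for a, b in combinations(sorted(set(order)), 2):
            if category_to_warehouse[a] != category_to_warehouse[b]:
                out_links += 1
    return out_links


def count_split_orders(orders, category_to_warehouse):
    """Count orders whose categories span more than one warehouse."""
    return sum(
        1
        for order in orders
        if len({category_to_warehouse[c] for c in order}) > 1
    )


# Hypothetical data: each order is the list of categories it contains.
orders = [
    ["snacks", "beverages"],
    ["snacks", "beverages", "cleaning"],
    ["cleaning", "paper goods"],
    ["snacks"],
]
assignment = {
    "snacks": "W1",
    "beverages": "W1",
    "cleaning": "W2",
    "paper goods": "W2",
}

print(count_out_links(orders, assignment))     # 2
print(count_split_orders(orders, assignment))  # 1
```

In this toy data, only the second order is split, and it contributes two out-links because both snacks and beverages are stored apart from cleaning.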
Clustering the inventory by category rather than by SKU made the problem easier to solve mathematically. But it also makes sense from a warehouse operations perspective, Singh said.
"You don't want different types of items which need a lot of material handling or different kinds of handling in the same warehouse, generally speaking," he said. "They want similar kind of items in the warehouse because your warehousing costs will go down."
He said this is why Amazon usually stores inventory based on size.
The researchers found that the algorithm they developed was able to cut down on out-links compared to previous methods.
"It is a significant improvement over the real category distribution in practice," the paper reads. "It means that if the online retailer uses our best category distribution, their number of split orders will decrease significantly."
Singh said the researchers developed an "interesting model" and that letting a company define its own categories can definitely simplify the process of clustering. But he underscored that it requires the person making the categories to understand the business, products and geography. And it is in line with work that operations researchers have been doing for a while.
"It's nothing revolutionary ... it's something that we've been doing here, we haven't used the model like this, but we've been using techniques [to solve the problem] in a similar fashion," he said.
Clustering as an analytical tool has found a number of uses within warehouses in recent years. It is a technique for determining whether observations, such as orders or product categories, fall into distinct groups.
"Identifying such groups can be of interest because it might be that the groups differ with respect to some property of interest, such as spending habits," reads the latest edition of "An Introduction to Statistical Learning."
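As a rough illustration of clustering applied to product categories, the Python sketch below groups categories that are frequently ordered together. It uses a generic greedy agglomerative merge on co-occurrence counts, which is not the algorithm from the paper. The function names and order data are hypothetical, and the sketch ignores warehouse capacity, which a real allocation would have to respect.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(orders):
    """Count how often each pair of categories appears in the same order."""
    counts = Counter()
    for order in orders:
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


def group_categories(orders, num_groups):
    """Greedily merge the two groups that are ordered together most often."""
    counts = co_occurrence(orders)
    categories = sorted({c for order in orders for c in order})
    groups = [{c} for c in categories]

    def link(g1, g2):
        # Total co-occurrence between every category in g1 and every one in g2.
        return sum(counts[tuple(sorted((a, b)))] for a in g1 for b in g2)

    while len(groups) > num_groups:
        i, j = max(
            combinations(range(len(groups)), 2),
            key=lambda pair: link(groups[pair[0]], groups[pair[1]]),
        )
        groups[i] |= groups[j]  # i < j, so deleting j leaves index i valid
        del groups[j]
    return groups


# Hypothetical orders, each listed by the categories it contains.
orders = [
    ["snacks", "beverages"],
    ["snacks", "beverages", "cleaning"],
    ["cleaning", "paper goods"],
    ["cleaning", "paper goods"],
    ["cleaning", "paper goods"],
]

groups = group_categories(orders, num_groups=2)
print(sorted(sorted(g) for g in groups))
```

With two groups requested, the toy data ends up with snacks and beverages in one group and cleaning and paper goods in the other, so each group could be assigned to its own warehouse.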
Singh used clustering in research that helped DHL optimize packaging sizes at warehouses and cut shipping costs. The technique can be used to determine not just which warehouse should store inventory, but also which aisle.
He noted that an interesting next step for this research would be using machine learning to generate the categories rather than relying on a human to generate the groups.
"If you happen to have a lot of SKUs, a lot of items and you don't want to leave it to one individual ... it might be worthwhile to use a machine learning technique [to create the categories]," he said.
The researchers had their own ideas for next steps: allowing a category to be held in multiple warehouses. Their current model assumes that each category of products is held in only one warehouse. But allowing the faster-moving items to be stored in multiple warehouses could cut down on splits even more, they suggest.
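A small Python sketch shows why stocking a category in more than one warehouse could help. It assumes an order avoids a split whenever at least one warehouse stocks every category in it. That rule, the function name and the data are hypothetical simplifications, not the researchers' model.

```python
def is_split(order, category_to_warehouses):
    """Return True if no single warehouse stocks every category in the order."""
    stocked = [category_to_warehouses[c] for c in set(order)]
    if not stocked:
        return False  # an empty order cannot be split
    # Warehouses that hold all of the order's categories.
    common = set.intersection(*stocked)
    return not common


order = ["snacks", "cleaning"]

# One warehouse per category, as in the current model.
single = {"snacks": {"W1"}, "cleaning": {"W2"}}

# Hypothetical variant: fast-moving snacks are also stocked in W2.
replicated = {"snacks": {"W1", "W2"}, "cleaning": {"W2"}}

print(is_split(order, single))      # True
print(is_split(order, replicated))  # False
```

The same order is split under the one-warehouse rule but can ship complete from W2 once snacks are stocked there as well.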
This story was first published in our weekly newsletter, Supply Chain Dive: Operations.