“Data scraping” concerns some Ontario cannabis retailers

By David Brown

Some Ontario retailers say they are concerned about a new type of “data leak” circulating within the industry that appears to be showing their sales. 

Unlike a leak of retailer info in 2022 that was the result of an issue with a third-party partner of the Ontario government, this newest release of information seems to be the result of third-party data “scraping” services that collate publicly available information.

Jennawae McLean, the co-founder of Calyx+Trichomes, a cannabis store in Kingston, Ontario, says she became aware of sales information in early February showing figures for stores in the Hamilton area and was concerned it could be even more widespread. 

After posting about it on Twitter, she began hearing from other retailers still reeling from last year’s data leak.

“We’re still not sure exactly what happened,” explains McLean, noting that in her discussion with other store owners, not all sales data appeared entirely accurate.

The Ontario Cannabis Store (OCS) and the Alcohol and Gaming Commission of Ontario (AGCO) both sent out notices to retailers that the leak did not originate from either of their organizations.

Owen Allerton, the owner of Highland Cannabis, a retail store in Kitchener, says he suspects the issue comes down to “data scraping” services that collect and collate information from websites like his own in order to glean information about product levels and sales figures. 

His own store’s online menu, he explains, lets someone select a product to purchase and tells the user if the amount they selected is more than the store has in stock. By running automated queries, these services can complete enough searches over time to build a rough idea of what a store’s sales figures could look like.

“Because it’s linked to inventory…you can inadvertently see your inventory through that dropdown mechanism. So they go through every SKU on a website like mine and take snapshots of what the max quantity is at different points of time and, over time, they’ll see what the sales are.”
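The approach Allerton describes can be illustrated with a minimal Python sketch. It is not any vendor's actual method: the snapshot values are invented, and the function name `estimate_units_sold` is hypothetical. It assumes each snapshot records the maximum quantity a store's menu allowed for one SKU at a given time, and it treats every drop between consecutive snapshots as units sold while ignoring increases, which it assumes are restocks.

```python
from datetime import datetime

# Hypothetical snapshots for one SKU: (time observed, max quantity the menu allowed).
# These numbers are made up for illustration only.
snapshots = [
    (datetime(2023, 2, 1, 9), 40),
    (datetime(2023, 2, 1, 21), 34),
    (datetime(2023, 2, 2, 9), 31),
    (datetime(2023, 2, 2, 21), 55),  # count went up: assumed to be a restock
    (datetime(2023, 2, 3, 9), 47),
]


def estimate_units_sold(snapshots):
    """Sum the drops between consecutive snapshots, ignoring increases."""
    ordered = sorted(snapshots, key=lambda s: s[0])
    sold = 0
    for (_, before), (_, after) in zip(ordered, ordered[1:]):
        if after < before:
            sold += before - after
    return sold


print(estimate_units_sold(snapshots))
```

With these made-up numbers the estimate comes to 17 units. An estimate like this is only rough: a restock and sales that both happen between two snapshots partly cancel out, and inventory adjustments that are not sales would be counted as sales. That may be one reason figures produced this way would not always match a store's real numbers.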

His concern, he says—something echoed by McLean—is that this can be a security risk for store owners already facing concerns with break-ins and robberies, as well as giving inside information to retailer competitors in Ontario’s highly saturated and competitive retail cannabis market. 

Allerton says the responsibility lies with the companies that manage retailers’ online sales platforms.

“On one hand, what products I carry and the prices of those products, that’s fair game. It’s on my website. But there’s a line there somewhere when you start to extrapolate my sales from all of this, and I think that, if it is from scraping, it falls back to the e-commerce providers. A lot of us are talking to our e-com providers and telling them they need to do something about this.”

For their part, Dutchie, the company that operates Highland’s online store, says they have heard concerns about sales information making its way to the public.

“While we are aware of ongoing discussions of data leaks in the Canadian market, we are confident that our customers’ data and our platform remain secure and protected,” Dutchie’s chief technology officer, Chris Ostrowski, tells StratCann. “We are committed to data privacy and place customer trust above all.”

“Protecting our customers’ data is a top priority. As part of our ongoing commitment to doing so, Dutchie recently earned a SOC 2 compliance certification to help keep customer data safe.”

Jeff Woods, the co-founder of Neobi Technologies, a service that utilizes such “scraping” tools to gather information for licensed producers and others within the industry, says the process is somewhat misunderstood. 

“Scraping is kind of a dirty word in our industry as it diminishes the complexity, but ultimately what we do is go to retail web pages, pull relevant, public-facing data twice a day, clean and organize it for LPs so they can monitor their products in the market,” he told StratCann.

The service exists mainly, he explains, for cannabis producers who otherwise have little insight into factors such as which stores carry their products or when and where certain products are or are not selling.

He says that rather than exploiting the system, Neobi and similar services are filling a gap in the market, arguing that many provinces don’t provide distribution or inventory data to producers.

“It’s a black hole of data in the supply and demand chain. LPs have no real-time understanding of where their products are and how quickly they’re being sold.”

“This is a consumer packaged goods industry,” he continues, “and inventory and distribution data are the basic information LPs need to operate their business effectively. We aim to ensure our partners have access to the intelligence they need to thrive in a heavily regulated industry.”

“We don’t collect or produce sales data; POS companies sell that information. We track product inventory counts and how they deplete over time, helping producers understand inventory velocity in specific markets or stores.”

“Regarding the reports circulated on Twitter, we can confirm Neobi does not handle proprietary data and only aggregates publicly available information. From what we can tell, this is not data produced by our team.”

Allerton, like other retailers who aren’t comfortable with that kind of information being available, says he may start looking for an e-commerce provider that can address his concerns.

“If they’re able to scrape that data, it’s a problem for the e-commerce providers we’re all using. If they can’t plug this hole, then we’ll need to look at viable alternatives.”

