In October this year, the Securities and Exchange Board of India (SEBI) released a consultation paper instructing platforms on how they can use AI/ML-based solutions to prevent fraud, impersonation, claims by unregistered entities, and the presence of unregistered entities. Platforms also have to identify content (including advertisements) that is related to the securities market, created by SEBI-registered entities or their agents, provides education about the securities market, or directs people to other mediums such as Telegram, WhatsApp, phone, and e-mail. They must implement these solutions to qualify as a specified digital platform (SDP) under the regulator.
One of the key questions emerging from this regulation is whether AI/ML-based solutions can effectively identify fraud and misinformation pertaining to the securities market. Speaking at MediaNama’s recent discussion about SEBI regulations, Vakasha Sachdev, Senior Manager, Government Affairs and Communications at Logically, said that while it is possible to track misinformation and disinformation using AI, the fact that SEBI is asking platforms to identify all sorts of other securities-related content could prove challenging. How AI solutions identify misinformation: “We did a bit of research on this at Logically; we were trying to understand the scope of financial misinformation and disinformation in India.
What we found is that, yes, there are a lot of patterns that these people follow. But now this is the interesting thing. What we looked at were the cases where there were clear attempts to mislead people, clear attempts to deceive people, the really fraudulent activity.
Now, for that stuff, you can track it in different ways,” he said, adding that the solution doesn’t even have to look at the content itself but rather at patterns of behavior to identify such individuals. The platform can then have an expert in place to look at the content. SEBI’s framework asks platforms to go beyond this, he said, explaining that while people could build models to identify all sorts of securities content, accurately classifying it would be harder.
“You could build a model that would look at securities content. But the problem is in terms of that model being able to accurately find this distinction between education, [advisory], and what a regular person can say about the securities market. That, I think, will be a little bit difficult,” Sachdev added.
He mentioned that AI models would only be able to provide a probabilistic rating of whether something is financial advisory or educational content, and relying on the AI’s rating 100% could be dangerous. Similarly, within advertising content as well, AI can only assess the distinction between advisory and education with human intervention, Sachdev said.
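To make “probabilistic rating” concrete, here is a minimal sketch of what a classifier-plus-threshold setup could look like, using scikit-learn with invented toy examples and an arbitrary review threshold. It is an illustration of the idea, not any platform’s actual system:

```python
# Minimal sketch of a probabilistic content rating with a human-review
# threshold. Training examples, labels, and the threshold are invented
# for illustration; a real system would need large, expert-labelled data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = [
    "Buy this stock now for guaranteed 300% returns in a week",
    "Join our channel for sure-shot intraday calls",
    "What is a mutual fund and how does SIP investing work?",
    "Understanding P/E ratios: a beginner's guide",
]
train_labels = ["advisory", "advisory", "education", "education"]

vectorizer = TfidfVectorizer()
clf = LogisticRegression().fit(vectorizer.fit_transform(train_texts), train_labels)

REVIEW_THRESHOLD = 0.75  # arbitrary cut-off: below this, route to a human expert

def rate(post: str) -> str:
    # predict_proba returns a probability per class, never a hard yes/no
    probs = clf.predict_proba(vectorizer.transform([post]))[0]
    best = probs.argmax()
    label, confidence = clf.classes_[best], probs[best]
    if confidence < REVIEW_THRESHOLD:
        return f"needs human review (best guess: {label}, p={confidence:.2f})"
    return f"{label} (p={confidence:.2f})"

print(rate("Guaranteed intraday profit calls, join now"))
print(rate("Markets felt volatile today"))  # ordinary speech, likely low confidence
```

The threshold is where Sachdev’s point bites: the model only ever emits a probability, and the low-confidence band is precisely where the human experts he describes would have to step in.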
“You’re going to have people employed by the platforms, who are like SEBI experts, regulation experts, to manage this process. They’ll have to then help build the algorithms that the platforms are going to use to track this stuff,” he explained. What kind of financial misinformation do platforms currently see? Sachdev explained that Logically has seen coordinated behavior to mislead people around Meta advertisements. He mentioned that these bad actors make accounts and groups with very minimal followers.
“They’ll put a post which will be talking about, ‘Okay, you can use Bitcoin to invest in this and you can use that’. They put out a case for what they’re basically suggesting you can do. Then they refer you to a closed room,” he said, adding that the majority of the deceptive practices actually happen on encrypted channels on platforms like Telegram.
However, to get people to those encrypted channels, these bad actors will first lure them in with content elsewhere. “You’ll see advertisements, bought advertisements on Meta, which you’ll see in the ad library, and they’ll be then targeted at a small town. It’ll be targeted in a particular area where they know there isn’t that understanding of retail investment,” Sachdev mentioned.
To stop these bad actors from defrauding people, one has to track suspicious ads from low-follower accounts that make references to the securities market and target tier 2 and tier 3 cities, he said. MediaNama’s editor Nikhil Pahwa suggested that legitimate social media influencers also use advertising and other methods to influence people. To this, Sachdev responded that platforms can tell apart legitimate and deceptive influence operations because the latter use tactics like bots and referrals to other platforms.
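As a rough illustration of combining the signals Sachdev describes (a low-follower advertiser, securities-market keywords, referrals to closed channels, narrow geo-targeting), a heuristic scorer might look like the sketch below. Every keyword, weight, and threshold here is an invented assumption, not Logically’s or any platform’s actual rule set:

```python
# Illustrative heuristic combining the behavioural signals described above.
# All keyword lists, thresholds, and weights are invented for this sketch.
SECURITIES_TERMS = {"stock", "intraday", "bitcoin", "sure-shot", "returns"}
REFERRAL_HINTS = {"t.me", "wa.me", "telegram", "whatsapp"}

def ad_risk_score(ad: dict) -> float:
    score = 0.0
    text = ad["text"].lower()
    if ad["advertiser_followers"] < 500:              # minimal-follower account
        score += 0.3
    if any(term in text for term in SECURITIES_TERMS):  # securities references
        score += 0.3
    if any(hint in text for hint in REFERRAL_HINTS):    # push to closed rooms
        score += 0.3
    if ad.get("geo_target_population", float("inf")) < 1_000_000:
        score += 0.1                                    # narrowly targeted ad
    return score

ad = {
    "text": "Sure-shot intraday stock calls, join t.me/example",
    "advertiser_followers": 42,
    "geo_target_population": 300_000,
}
print(ad_risk_score(ad))  # 1.0 -> flag for expert review
```

The design choice matches his point about behavior over content: none of these signals alone proves fraud, but the combination of a tiny account, securities keywords, an off-platform referral, and a narrowly targeted audience is what makes an ad worth a reviewer’s time.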
When asked how equipped platforms currently are to handle this deceptive content, Sachdev explained that with Meta, fact-checkers signed up with Meta’s third-party fact-checking program (3PFC) see all the content that users flag as misinformation, as well as content flagged by Meta’s algorithms. Can AI detect coded language? Pahwa mentioned that in the early years of the internet, people avoided platform censorship using leet speak, where numbers or special characters replace letters in a word. So, for instance, a word like “need” would become something like “n33d”.
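Mechanically, both the trick and the classic countermeasure are simple character substitution: a fixed table maps disguised characters back to letters before keyword matching. A toy normaliser in Python, with a mapping that is illustrative and deliberately incomplete (which is exactly why fixed tables keep falling behind):

```python
# Toy leet-speak normaliser: map common character substitutions back to
# letters before keyword matching. The mapping is illustrative and
# incomplete; evaders only need one substitution it doesn't cover.
LEET_MAP = str.maketrans({"3": "e", "4": "a", "0": "o", "1": "i",
                          "5": "s", "7": "t", "$": "s", "@": "a"})

def normalise(text: str) -> str:
    return text.lower().translate(LEET_MAP)

print(normalise("n33d gu4r4nt33d r3turn5?"))  # -> "need guaranteed returns?"
```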
“I think, in a sense, AI/ML is one of the ways in which you can address that problem. Because earlier you were relying on a non-adaptive system that can only look at specific words, specific phrases, and then you moderate based on that,” Sachdev said, adding that with AI, you can build the model to encompass words which use alternative characters. “The problem is, of course, it will keep changing.
There are people who will find newer and newer ways to get around this,” he explained. He said that this can make it harder for platforms to take action against content within SEBI’s prescribed timeline, which requires platforms to identify content within 24 hours and block/take down problematic content within 72 hours. For ads, SEBI requires platforms to identify problematic content within 24 hours and block it before it goes live.
How effectively do AI models track URLs? Sachdev said that ideally, companies should be able to build their AI systems to go through short links and identify where a link leads. “It is a little bit spotty. Sometimes it will work, sometimes it won’t.
[However] you can, I think, build that capability for sure. But again, there will be challenges with that. In terms of the false positives, I think you are in a situation where all your tech on this will only be able to really give you a probabilistic rating.
The wider you make the net, the more false positives you’re going to get out of it,” he pointed out.
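Going back to short links: a basic resolver that follows a shortened URL through its redirects to the final destination can be sketched with the requests library. This is a minimal illustration; a production pipeline would need sandboxing, retries, and defences against links that serve scanners different content than they serve users:

```python
# Resolve a shortened URL by following redirects to its final destination.
# Minimal sketch only; failures here are one source of "spotty" results.
import requests

def resolve_short_link(url: str) -> str | None:
    try:
        # HEAD keeps the request cheap; some shorteners only answer GET
        resp = requests.head(url, allow_redirects=True, timeout=5)
        return resp.url  # the final URL after all redirects
    except requests.RequestException:
        return None  # dead, blocked, or rate-limited link

print(resolve_short_link("https://bit.ly/example"))  # hypothetical link
```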
To this, Pahwa asked him whether the model could identify links in screenshots or images. “I can create a short URL or a t.me link in an image and upload it. When it’s a mix of text and image, where there might be a URL and an image, how feasible is it to be able to screen for that as well?” he questioned. Sachdev responded that companies could build OCR (optical character recognition) into their AI models, which should be able to scan such content.
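In its simplest form, that OCR step could look like the following sketch, assuming the pytesseract wrapper around the Tesseract engine; the file name is hypothetical, and a real pipeline would add image preprocessing and far more robust URL detection:

```python
# Extract text from an image with Tesseract OCR, then scan the result
# for URLs (including t.me-style links). Sketch only; requires the
# Tesseract binary to be installed alongside the pytesseract package.
import re
from PIL import Image
import pytesseract

URL_PATTERN = re.compile(r"(?:https?://|t\.me/|wa\.me/)\S+", re.IGNORECASE)

def links_in_image(path: str) -> list[str]:
    text = pytesseract.image_to_string(Image.open(path))
    return URL_PATTERN.findall(text)

print(links_in_image("uploaded_post.png"))  # hypothetical file
```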
Mix of image and text can be difficult to identify: However, the fact that the content Pahwa suggested would be a mix of image and text could make it trickier to tackle, he said. “We’ve seen some amount of success with being able to do OCR reading of text on images. That can be done, and that can be done on an AI/ML system as well.
But I mean, it is a big challenge. It’s again going to raise so many issues, with people trying to find more and more ways to get around those things,” he said, emphasising that platforms are better off trying to identify patterns of deceptive behavior rather than looking at specific posts. “If someone is just posting images of particular kinds, and you then cross-link that with their other posts, the people they’re sharing with, and the people they’re connected with, you could come up with a TTP framework, a Tactics, Techniques, and Procedures framework, which you could try to identify using an AI system,” he said.
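Read in engineering terms, that cross-linking idea amounts to building a graph that connects accounts through shared artefacts (identical image hashes, referral links, mutual contacts) and surfacing unusually large clusters for expert review. A toy sketch with networkx, where the data and the cluster threshold are invented for illustration:

```python
# Toy coordination detector: link accounts that post identical image
# hashes or the same referral links, then flag large connected clusters
# for expert review. Data and threshold are invented for illustration.
import networkx as nx

posts = [
    ("acct1", "imghash_ab12"), ("acct2", "imghash_ab12"),
    ("acct2", "t.me/room9"),   ("acct3", "t.me/room9"),
    ("acct4", "imghash_ff01"),  # isolated account, never flagged
]

G = nx.Graph()
for account, artefact in posts:
    G.add_edge(account, artefact)  # bipartite: accounts <-> shared artefacts

CLUSTER_THRESHOLD = 3  # arbitrary: clusters this large go to human review
for component in nx.connected_components(G):
    accounts = {n for n in component if n.startswith("acct")}
    if len(accounts) >= CLUSTER_THRESHOLD:
        print("possible coordinated cluster:", sorted(accounts))
```

As the sketch suggests, no single post is judged in isolation; it is the shared artefacts tying otherwise unrelated accounts together that produce the signal, which is also why such systems need extensive training and tuning.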
However, you would have to spend a lot of time training these systems, and not all platforms would be able to do that in-house, he explained. Platform-specific challenges in implementing AI/ML detection: Tamoghna Goswami, Head of Policy at ShareChat, mentioned that the way SEBI defines advertising in its consultation paper is fairly broad. “How the word advertisement is defined here is that it is intended to promote the sale and, in addition, can include content that is identified or classified,” she said, adding that this definition could also cover self-declared advertisements by influencers.
She questioned how a platform would be able to identify such a self-declared ad. Goswami also pointed out that in the consultation paper, SEBI prescribes methods for how platforms need to identify content, stating that this may not work well with a technological solution like AI/ML.