How this startup is changing regional news sharing in India through tech
Lokal is an Indian hyperlocal content app catering to India’s non-English speakers. The app is available in 8+ regional languages. In an exclusive interview with Analytics India Magazine, co-founder Vipul Chaudhary spoke about Lokal’s services, tech stack and the language corpus.
AIM: What’s Lokal trying to solve?
Vipul Chaudhary: People living in smaller towns consume media differently from us. In small towns, people are more interested in what is happening around them because it affects their work and livelihoods. But only small regional newspapers were catering to them, and they were not doing a great job. The papers ran mostly on government ads and had no incentive to do better. So, we had found a gap. Also, job portals and matrimonial platforms had hardly any relevant matches for the region, so people were forced to move to the state capitals. So, there was also this information gap. We wanted to bridge it and started Lokal as an information marketplace for local updates, real estate listings, job searches, etc.
AIM: Tell us about Lokal’s offerings.
Vipul Chaudhary: Our core offering is relevant local updates. We also provide information on the prices of vegetables, gold, silver, petrol, diesel, etc. The platform also has information about local health services and utilities. On top of this, we provide five classified services: jobs, matrimonial, real estate, ‘buy and sell’ and wishes. Wishes are unique to our platform and are similar to the wishes we see in newspapers.
We offer our services in eight languages, including Punjabi, Telugu, Tamil, Kannada, Malayalam, Marathi, Gujarati and Bengali.
AIM: Which Tier accounts for your core user base?
Vipul Chaudhary: Our core user base is customers in Tier 2 or 3 cities. We also have users from Hyderabad, Bangalore and Mumbai. We have 20-22% penetration in a district, but it’s less in cities like Bangalore and Mumbai. While we are not focusing on the top tier, we believe it can also benefit from what Lokal offers in terms of local information.
AIM: Tell us about your language corpus.
Vipul Chaudhary: Our regional language corpus comprises the information people have posted on our platform in their regional languages. Our focus is on how we use it. We are trying to take the information that comes to us on the platform and analyse and understand it through NLU. That’s the input part. After understanding the language, we have to understand the users as well. We tag their interests based on what they are saying. Now, we have content understanding and user profiling; we take these two parts, and then our job is to match them with each other.
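As an illustration of the matching step described above, here is a minimal sketch in Python. It is not Lokal's implementation: the function names, the tag-counting profile and the additive score are all assumptions chosen to show how tagged content and an interest profile could be matched.

```python
from collections import Counter


def build_user_profile(interactions):
    """Count how often each tag appears in the posts a user engaged with.

    `interactions` is an iterable of tag collections, one per post.
    (Hypothetical profile format, chosen for illustration only.)
    """
    profile = Counter()
    for tags in interactions:
        profile.update(tags)
    return profile


def score(profile, post_tags):
    """Sum the user's interest weight for every tag on the post."""
    return sum(profile[tag] for tag in post_tags)


def rank_posts(profile, posts):
    """Return post ids ordered by descending score.

    `posts` maps a post id to its tags. sorted() is stable, so posts with
    equal scores keep their original order.
    """
    ranked = sorted(posts.items(), key=lambda item: -score(profile, item[1]))
    return [post_id for post_id, _ in ranked]


if __name__ == "__main__":
    profile = build_user_profile([{"jobs", "guntur"}, {"jobs"}])
    posts = {"a": {"cricket"}, "b": {"jobs", "guntur"}, "c": {"jobs"}}
    print(rank_posts(profile, posts))
```

A production system would likely use learned representations rather than raw tag counts; the sketch only shows the shape of the two inputs and the match between them.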
AIM: Indic datasets are one of the biggest focus areas in AI. How does Lokal collect regional data?
Vipul Chaudhary: Existing NLP libraries and datasets usually rely on Wikipedia or on a corpus scraped from regional newspapers. Lokal generates 10x more content than local newspapers. Our content talks about what is happening in the area, different subjects and people. For local information, we ask where it happened, when it happened and what happened. Hence, we have a better-tagged corpus by default. If your input is better, your output will be better. We have a very clean, filtered, tagged input and a huge corpus on top of it. That helps us build much more accurate models.
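To make the "where, when, what" tagging concrete, here is a small Python sketch of such a record. The class name, field names and language code are hypothetical; the interview does not describe Lokal's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class LocalUpdate:
    """One user-submitted update, tagged at the time of posting."""

    where: str     # district or town the update is about
    when: date     # date on which the event happened
    what: str      # free text in the regional language
    language: str  # language code such as "te" (assumed convention)

    def __post_init__(self):
        # Reject records that are missing the tags the corpus depends on.
        if not self.where.strip() or not self.what.strip():
            raise ValueError("where and what must be non-empty")


def to_training_record(update):
    """Convert an update to a plain dict with an ISO-formatted date."""
    record = asdict(update)
    record["when"] = update.when.isoformat()
    return record


if __name__ == "__main__":
    update = LocalUpdate("Guntur", date(2022, 1, 5), "Road repaired", "te")
    print(to_training_record(update))
```

Because every record carries its place, date and language from the start, no separate labelling pass is needed before the text is used for modelling.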
AIM: Tell us about Lokal’s tech stack.
Vipul Chaudhary: Our tech stack has evolved rapidly. When I first built it, I used Python Django as my starting point because it could quickly get us up and running.
Even today, the core is Python Django. Our website is based on Node.js and React.js. Our Android app is native Android, written in Kotlin. We have an analytics system that we’ve built in-house based on Redpanda and Apache, with Kafka queues in the middle. We also do a lot of video transcoding right now. Our video transcoder turns uploaded videos into streamable content, because they arrive as static video files. This service runs on FFmpeg codecs, and for tags, we use Python.
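As a rough illustration of turning a static video file into streamable content with FFmpeg, the Python sketch below builds an FFmpeg command that produces an HLS playlist. The interview only says that FFmpeg is used; the choice of HLS, the H.264 and AAC codecs, the segment length and the function name are assumptions.

```python
from pathlib import Path


def build_hls_command(source, out_dir, segment_seconds=6):
    """Build an FFmpeg command that splits a video into HLS segments.

    Hypothetical helper: it only assembles the argument list and does not
    run FFmpeg or touch the file system.
    """
    out = Path(out_dir)
    return [
        "ffmpeg",
        "-i", str(source),                  # input: the static video file
        "-c:v", "libx264",                  # encode video as H.264
        "-c:a", "aac",                      # encode audio as AAC
        "-f", "hls",                        # use FFmpeg's HLS muxer
        "-hls_time", str(segment_seconds),  # target segment length in seconds
        "-hls_playlist_type", "vod",        # complete, on-demand playlist
        "-hls_segment_filename", str(out / "segment_%03d.ts"),
        str(out / "index.m3u8"),            # playlist the player requests
    ]


if __name__ == "__main__":
    # To transcode for real, pass the list to subprocess.run(cmd, check=True)
    # on a machine with FFmpeg installed and an existing output directory.
    print(" ".join(build_hls_command("clip.mp4", "out")))
```

A player then fetches the playlist and downloads segments as needed, instead of the whole file at once.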
AIM: Considering the magnitude of user data Lokal collects, how do you ensure data security?
Vipul Chaudhary: We use the best standards to ensure data privacy. All our servers are in India. We anonymise the user data, and we analyse it in buckets and chunks. Most of the analysis is done by machines. We have completely isolated our production environment from our development environment on our cloud systems. No developer has direct access to our production environment.
All our data is in a virtual private cloud to prevent outside attacks. Since the data itself is anonymised and isolated between environments, even if someone hacks into one system, they will not be able to extract personally identifiable information. Also, development in cloud safety and data safety is an active research domain for us.
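The idea of analysing anonymised data in buckets can be sketched in Python as follows. This is an assumption-laden illustration, not Lokal's pipeline: it replaces user ids with a keyed hash (strictly speaking pseudonymisation) and reports only per-bucket counts, dropping buckets below a hypothetical minimum size.

```python
import hashlib
import hmac


def pseudonymise(user_id, secret_key):
    """Replace a user id with a keyed SHA-256 digest (HMAC).

    Without the secret key, the digest cannot be recomputed from a guessed id.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def bucket_counts(events, secret_key, min_bucket_size=2):
    """Count distinct pseudonymous users per bucket.

    `events` is an iterable of (user_id, bucket) pairs, where a bucket could
    be a district. Buckets with fewer than `min_bucket_size` users are left
    out so that very small groups are not reported.
    """
    users_per_bucket = {}
    for user_id, bucket in events:
        token = pseudonymise(user_id, secret_key)
        users_per_bucket.setdefault(bucket, set()).add(token)
    return {
        bucket: len(users)
        for bucket, users in users_per_bucket.items()
        if len(users) >= min_bucket_size
    }


if __name__ == "__main__":
    events = [("u1", "guntur"), ("u2", "guntur"), ("u1", "guntur"), ("u3", "nellore")]
    print(bucket_counts(events, secret_key=b"example-key"))
```

Analysts working with the output see only aggregate counts per bucket, never raw identifiers.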