From Data Defiance to Cyber Resilience: The Winners of Shell’s Cyber Threat Hackathon

Shell and MachineHack jointly hosted the ‘Cyber Threat Detection Hackathon’, which kicked off on September 15 and concluded on November 10, 2023, marking a significant initiative in advancing cybersecurity solutions.

The challenge was to develop models that identify hidden code in text to enhance web application security and resilience against cyber threats.

The hackathon focused on detecting code hidden in text, since concealing code this way is a method commonly used by malicious actors to breach systems and access data.

Hackers often hide malicious code in innocuous media like images, videos, or text files. This embedded code can be executed unknowingly, compromising security.

In the hackathon, the participants received a text with hidden source code. The code might lack source control markers and could be split into segments within the text. Participants had to identify and extract this code, using strategies that included pattern recognition, machine learning models, and natural language processing techniques.

The challenge required analysing text structures, identifying anomalies, and using algorithms to detect the embedded code. The participants adapted their methods to diverse text formats and unexpected code placements.

The hackathon was open to everyone except the employees and contractors of the organisers, and the jury consisted of Shell’s top leadership.

The Prize Money

The stakes were high: the first-prize winner would receive $2,500, while the second- and third-prize winners would take home $1,200 and $700, respectively. In addition, the next ten runners-up would receive $60.

The Winners

Winner: Ramashish Gupta

Ramashish Gupta, an undergraduate at IIT Kharagpur, won the first prize in the hackathon. His approach involved a two-step training process: addressing challenges in code repetition and improving accuracy through dynamic text matching. “At the Shell Hackathon, I started with a pre-trained T5 model from Salesforce. I fine-tuned it using a two-step training process involving seq2seq training,” Gupta explained.

One challenge was the model’s inability to replicate code, which caused errors. To address this, he added a pre-training step that taught the model to replicate code before the actual training. He also implemented text matching for code extraction using dynamic matching. Extensive data analysis helped reduce issues such as data inconsistency.
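The article does not describe how the “dynamic matching” worked, so the following is only one possible reading: a minimal sketch that snaps a model’s slightly imperfect output back onto the exact span of the original text, using Python’s standard `difflib` module. The function name `snap_to_source` and the `min_block` threshold are hypothetical and are not taken from the winning solution.

```python
from difflib import SequenceMatcher
from typing import Optional


def snap_to_source(source: str, prediction: str, min_block: int = 4) -> Optional[str]:
    """Return the span of `source` that best aligns with `prediction`.

    Assumption: the model's output is close to, but not identical with,
    the code embedded in the source text (e.g. a dropped character).
    """
    matcher = SequenceMatcher(None, source, prediction, autojunk=False)
    # Keep only matching blocks long enough to be meaningful; this also
    # drops the zero-length sentinel block difflib appends at the end.
    blocks = [b for b in matcher.get_matching_blocks() if b.size >= min_block]
    if not blocks:
        return None
    start = blocks[0].a                      # first aligned character in source
    end = blocks[-1].a + blocks[-1].size     # one past the last aligned character
    return source[start:end]


source = "Please review this: for i in range(10): print(i) and tell me what it does."
prediction = "for i in range(10): prnt(i)"   # hypothetical model output with a typo
extracted = snap_to_source(source, prediction)
```

Because the returned string is sliced from the source rather than from the prediction, small generation errors do not end up in the extracted code.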

Check out the solution here.

1st Runner-up: Mohan Krishna Gupta

Mohan Krishna Gupta, a recent BTech graduate and an NLP engineer at Textify AI, secured the second position at the hackathon. He framed the problem as an NLP question-answering task and used an ensemble of RoBERTa and DeBERTa models.

“I experimented with different models and finally built an ensemble using RoBERTa and DeBERTa, training each for 30 epochs on Google Colab,” Gupta said. He used the PyTorch and Hugging Face Transformers libraries for training. To handle the large context sizes, he split each context into smaller chunks and adjusted the answer indices for each chunk.
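The chunking idea can be sketched in plain Python. This is a minimal illustration under assumptions, not Gupta’s actual code: it works on characters rather than model tokens, and the function name `chunk_with_answer` and the `chunk_size` and `stride` parameters are hypothetical.

```python
from typing import Dict, List


def chunk_with_answer(context: str, answer_start: int, answer_len: int,
                      chunk_size: int = 200, stride: int = 100) -> List[Dict]:
    """Split `context` into overlapping windows and re-base the answer index.

    Each chunk records its own `answer_start`, relative to the chunk, or -1
    when the answer is not fully contained in that chunk.
    """
    if not 0 < stride <= chunk_size:
        raise ValueError("stride must be in (0, chunk_size]")
    answer_end = answer_start + answer_len
    chunks: List[Dict] = []
    offset = 0
    while True:
        window = context[offset:offset + chunk_size]
        window_end = offset + len(window)
        if answer_start >= offset and answer_end <= window_end:
            local_start = answer_start - offset   # shift index into chunk coordinates
        else:
            local_start = -1                      # answer absent or cut by the boundary
        chunks.append({"text": window, "offset": offset, "answer_start": local_start})
        if window_end >= len(context):
            break
        offset += stride
    return chunks


context = "lorem ipsum " * 10 + "print('hi')" + " dolor sit" * 10
answer = "print('hi')"
chunks = chunk_with_answer(context, context.index(answer), len(answer),
                           chunk_size=50, stride=25)
hits = [c for c in chunks if c["answer_start"] >= 0]
```

Overlapping windows (a stride smaller than the chunk size) reduce the chance that an answer is split across a boundary and lost from every chunk.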

“I often participate in hackathons as they always provide a good exposure to using the latest technologies to solve problems,” he said.

Check out the solution here.

2nd Runner-up: Jatin Yadav

Jatin Yadav, a GCP data engineer at Cognizant, secured third place in the hackathon. A computer science graduate, Yadav initially attempted manual code extraction before shifting to LLMs, specifically FLAN-T5. He extended the model’s tokeniser to recognise specific programming tokens, improving its accuracy.

“I tried multiple LLMs, including BERT, LLaMA and FLAN-T5 variants. At last, I stuck with FLAN-T5 (xl variant: 2.85 billion parameters) because of its portability, less training time and more accurate results.” Yadav also added tokens that the model did not recognise (“{”, “}”, “\”) and retrained it for better accuracy.
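Why missing tokens matter can be shown with a toy example. The sketch below is not Yadav’s code and does not use FLAN-T5; `ToyTokenizer` is a hypothetical character-level tokeniser written only to illustrate the effect. In the Hugging Face Transformers library, the comparable step would typically involve `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`, though the article does not confirm which calls he used.

```python
import string
from typing import Dict, List

UNK = "<unk>"


class ToyTokenizer:
    """Character-level tokeniser with a fixed vocabulary (illustrative only)."""

    def __init__(self, vocab: List[str]) -> None:
        self.token_to_id: Dict[str, int] = {UNK: 0}
        self.add_tokens(vocab)

    def add_tokens(self, tokens: List[str]) -> int:
        """Add unseen tokens to the vocabulary and return how many were added."""
        added = 0
        for tok in tokens:
            if tok not in self.token_to_id:
                self.token_to_id[tok] = len(self.token_to_id)
                added += 1
        return added

    def encode(self, text: str) -> List[int]:
        # Characters outside the vocabulary collapse to the <unk> id (0).
        return [self.token_to_id.get(ch, 0) for ch in text]

    def decode(self, ids: List[int]) -> str:
        id_to_token = {i: t for t, i in self.token_to_id.items()}
        return "".join(id_to_token[i] for i in ids)


tok = ToyTokenizer(list(string.ascii_lowercase + string.digits + " ();=\n"))
snippet = "if (a) { b = 1; }"

before = tok.decode(tok.encode(snippet))    # braces are lost to <unk>
num_added = tok.add_tokens(["{", "}", "\\"])
after = tok.decode(tok.encode(snippet))     # braces now survive the round trip
```

In a real model, the newly added tokens start with untrained embeddings, which is consistent with the retraining step described above.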

Check out the solution here.

Other winners

Apart from the top three winners, the challenge recognised Prabin Kumar Nayak, Roshan Rateria, Rajat Ranjan, Bhavyan Sahayata, Thangadurai Jayaraman, and Ayush Patel as the runners-up.

The ‘Cyber Threat Detection Hackathon’ provided a platform for emerging talents in AI and ML to demonstrate their skills in cybersecurity. It also highlighted a collaboration between the technology sector and the developer community, and showcased progress in applying AI to cybersecurity challenges.

The post From Data Defiance to Cyber Resilience: The Winners of Shell’s Cyber Threat Hackathon appeared first on Analytics India Magazine.