Browse All Ideas

Browse all ideas generated at the T&S Hackathons

The T&S Hackathon's goal is to identify and build innovative solutions that protect users no matter where they roam online. To do that, we hold hackathons both remotely and at various locations around the globe to ensure everyone has the chance to share a big idea that will improve user safety.

T&S Hackathon Idea List

We’re always generating interesting ideas at T&S Hackathons! If you’d like to learn more about any of the ideas below or be connected to the groups that came up with them, please email hello [at] anytime! Note that these ideas are in no particular order (be sure to use the filter options below).

Develop an LLM to predict the age of a user with a high degree of confidence. A dataset to train, test, and validate the model could be constructed by verifying the age of a fixed number of users (through parental control features such as Instagram Family Center or Google Family Link) and then analyzing specific factors associated with them to understand trends: the age of the accounts these users are connected to, and the login times of users of a specific age. If the predicted age of a user doesn’t match their reported age, we will take action, such as not showing them certain content (for example, politics or sexually inappropriate content).
Child Safety | San Francisco 2023
Undersight Board
Voice of the Community System for Policy Development and Localization: an LLM-generated persona-building system for moderators, policy managers, and operations teams to generate community-specific input for T&S policy.
Governance | San Francisco 2023 | Top Idea
A generative chatbot that empowers trust professionals. This chatbot will be trained on all publicly available trust and safety information across the internet, including transparency reports, video transcripts from trust and safety talks, current regulations, and researcher input.
Governance | San Francisco 2023
Safe Speak
AI chatbot that empowers users to take control of distressing situations. Prompts users during a challenging exchange to report content and automatically plugs into existing reporting flows.
User Education | San Francisco 2023
Safety Interventions
Use behavioral analytics, fraud detection machine learning models, and textual analysis to detect scam and fraud signals in user-to-user interactions. Provide warnings and fraud education to users to prevent further engagement with bad actors. Highly customizable for each platform to address its needs.
Scams & Fraud | San Francisco 2023
Tool that uses AI to optimize communications and interventions for each individual user in a way that maximizes comprehension and application. Using both proprietary user data and public data, the model: learns who the user is; adapts to their understanding and usage of the product; helps protect users with tailored communication and education.
Policy | San Francisco 2023
SafetyGPT is a web-based interface that enables users to quickly access easy-to-understand information about how to stay safe or take action when experiencing a harm on top platforms. It’s your one-stop shop for all things policy, community guidelines, and safety and integrity tools.
Policy | San Francisco 2023
Generating synthetic datasets to train models on harmful content. Real-world datasets to train models on harmful content are siloed and insufficient to support emerging platform needs. We propose UnSafeSets: generating synthetic datasets to train models on harmful, high-sensitivity, low-prevalence content.
Harmful Content | San Francisco 2023
AI Safety Network
AI-powered connection model for sharing cross-platform abuse
Any industry could spin up its own central clearinghouse for sharing hashed content identifying abusive users, e.g., make it easy for Bumble to tell Tinder about a user who committed a sexual assault. Cross-sector effort (public, private, and civic) to define policies and privacy standards.
Data Sharing | San Francisco 2023
We are designing a T&S Educational Chatbot for social media platforms which will educate users on policies and abuses when they seem to be interacting with suspected violating content.
User Education | San Francisco 2023
XCALibR (Xplatform Content Abuse Library Reference): Cross-Platform Online Safety Collaboration Software. The database attempts to unify content policy and ethical standards across all platforms to find a common ground. Based on that policy, ML models are built to learn from existing enforcement data that is accessible to every platform. Emphasizes transparency, tolerance-level customizability, and inclusivity.
Data Sharing | San Francisco 2023 | Top Idea
AI simplifies the user agreement process by comparing the user’s values with the platform’s policies and terms of agreement and highlighting differences in simple terms. If a user is about to violate platform policies, the AI informs the user about the violations. The long-term goal of the tool is to ensure accountability while maintaining privacy as a priority.
Transparency | San Francisco 2023
One-Stop Policy Shop
We will leverage publicly available platform policies, vetted academic research on T&S, site safety centers, and existing regulations. In the short term, we will leverage DeepL and Google Crowdsource for initial language translation for the LLMs. Long-term, we will red-team our LLM with researchers and civil society volunteers.
Policy | San Francisco 2023 | Top Idea
B.L.O.B.I. (Big List of Bad Ideas)
An AI model designed to detect and track emerging hate groups by analyzing and learning from existing data sets of hate speech. In collaboration with trusted partners, we aim to continually refine the data set to improve the accuracy and relevance of the training data. With T&S professionals, researchers, and NGOs in mind, this database will provide real-time updates on emerging hate speech patterns, helping them respond effectively to protect users.
Data Sharing | San Francisco 2023
An information-sharing consortium to facilitate the transfer of bad actor behavioral patterns and signals between Trust & Safety teams. WatchDog will consume shared information and develop user-level AI models focused on behavior to detect and enforce against accounts and networks that cause harm. A continuous feedback loop will allow platforms to feed data into WatchDog as patterns and behaviors change over time.
Data Sharing | San Francisco 2023
AI Intervention
AI intervention for checking and enhancing behavioral traits to detect scam activity. Platforms that allow you to make purchases on them (for example, UPIs); platforms with detection systems; platforms looking for proactive solutions for detecting scam activity.
User Education | San Francisco 2023
Train an AI model using diverse multilingual takedown datasets to identify and flag instances of moderation evasion across platforms, enabling the labeling of posts and profiles that engage in such tactics. Outputs an API to layer over existing trust and safety frameworks to continuously detect new examples of evasion tactics. Categorizes types of behaviors and the communities associated with them and flags them to humans, allowing platforms and moderators to keep up with multi-platform abuse. Bi-directional model allows platforms to contribute to the model collaboratively. Data is anonymized and behavior-focused to remove privacy and compliance risk. Mimics GIFCT’s Hash-Sharing Database.
Data Sharing | San Francisco 2023 | Top Idea
AI Gold Standards
Methodology to evaluate an AI system to eliminate biases, increase accuracy, and enhance contextualization.
Governance | San Francisco 2023
A prosocial AI-powered chatbot that identifies and prevents negative online experiences. “Chippy* takes a different approach to community moderation, by empowering users themselves with better education, tools, and perks to have users be more invested and proactive in the communities they are in”. Build trust in products; positive impact on company growth and user engagement.
User Education | San Francisco 2023
A third-party AI-based application, using LLMs, response metrics, sentiment and speech analysis to analyze electronic communication and identify exclusionary behaviors such as bullying, discrimination, and hate. Aggregate corporate data to: analyze text and speech to detect tone and prevent harassment; measure employees against their own baseline to detect patterns of exclusion or discrimination; support HR compliance; use trend analysis to validate companies’ compliance with DEI values in the workplace.
User Education | San Francisco 2023
SafeTrace Collaborative
Repository brings together data from various social media companies, collaborating to detect and combat malicious patterns associated with child sexual abuse and trafficking through advanced AI detection. STC is a first-of-its-kind, open-source AI resource dedicated to protecting children from malicious harm online. STC compiles and presents the data, mapping trends to deliver comprehensive insights and mitigate threats against children, ensuring their safety and wellbeing in the digital sphere.
Data Sharing | San Francisco 2023
Salama Button
A dedicated in-site button to improve accessibility and reduce turn-around time for victims of online abuse. Online abuse of children is increasing while reporting of cases remains low. This is attributed to the complexity of existing reporting systems across all platforms, which don’t accommodate the brain’s responses during a state of panic and fear. The Salama button aims to address this through a personalized safety process.
User Education | San Francisco 2023
An interactive AI-driven safety plug-in to empower young users and help keep them safe online. Acorn helps drive awareness and educates a user on online safety through minor-friendly design and communication. It grows with the child until they reach the age of 18. Acorn helps explain policies, can give guidance on what content is visible to the user, and can provide safety advice.
User Education | Paris 2023
Unified Safety Platform
Unify and democratise access to T&S knowledge, solutions and signals. Based on two pillars: 1) provision of actionable regulatory information; 2) provision of plug-and-play content intelligence from open access and proprietary sources.
Data Sharing | Paris 2023
Think Twice
A custom AI chatbot will be built into the platform to give users tailored advice aligned with community guidelines. The bot will then monitor and alert users when they try to publish content that might violate community guidelines.
User Education | Paris 2023
Policy Mapper
A policy development and refinement tool that helps policy teams at new and small platforms create policies that are benchmarked against industry and legal standards for illegal content. This tool bridges the gap between the law and the interpretation of the law.
Policy | Paris 2023
Safe Sharing
A collaborative approach to reduce sexual harassment recidivism across platforms. Safe Sharing creates a network between platforms and a third-party knowledge source to be able to offer awareness-raising programs to violating users so that they can safely return to platforms, as well as real-time insights on emerging threats to platforms.
Data Sharing | Paris 2023
Centralised portal which: (i) connects content reporters to the appropriate trusted flagger specialised in the relevant region or subject matter; (ii) provides useful sorting & review features for TFs in a cost-effective & scalable manner.
Governance | Paris 2023
An independent body that is held accountable by the public via its elected board of directors, drawing in competitive and diverse talent to stay ahead of evolving online risks.
Governance | Paris 2023
Nudge is a real-time, scam-specific detection system. It’s powered by AI and scam-specific datasets for LLM training, beginning with sextortion and growing the scam use cases over time. 1) The scammer sends a message to a potential victim. 2) Nudge detects the high-risk message based on a scam-specific optimized dataset, providing a risk score along with associated data about previous scam incidents. 3) The user gets a nudge informing them of the potential scam, along with links to the Nudge resource center. 4) The Nudge resource center includes a scam-specific chatbot, “NudgeAI”.
AI / ML | Paris 2023 | Top Idea
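The Nudge flow above (message arrives, message is scored, user is nudged when the score is high) can be sketched roughly as follows. This is only an illustration under loud assumptions: the phrase weights stand in for a trained scam-specific model, and the threshold, the names `score_message` and `maybe_nudge`, and the resource URL are all hypothetical and do not come from the Nudge team.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical phrase weights standing in for a trained, scam-specific model.
SEXTORTION_SIGNALS = {
    "send money": 0.4,
    "share your photos": 0.4,
    "gift card": 0.3,
    "or else": 0.3,
}
NUDGE_THRESHOLD = 0.5  # assumed cut-off, not taken from the source


@dataclass
class Nudge:
    risk_score: float
    matched_signals: List[str]
    resource_url: str


def score_message(text: str) -> Tuple[float, List[str]]:
    """Return a risk score capped at 1.0 and the phrases that triggered it."""
    lowered = text.lower()
    matched = [phrase for phrase in SEXTORTION_SIGNALS if phrase in lowered]
    score = min(1.0, sum(SEXTORTION_SIGNALS[phrase] for phrase in matched))
    return score, matched


def maybe_nudge(text: str) -> Optional[Nudge]:
    """Build a nudge for the recipient only when the message looks high-risk."""
    score, matched = score_message(text)
    if score < NUDGE_THRESHOLD:
        return None
    # Placeholder link; a real deployment would point at its own resource center.
    return Nudge(score, matched, "https://example.com/nudge-resource-center")
```

Calling `maybe_nudge` on an ordinary greeting returns `None`, while a message that combines several of the listed phrases returns a `Nudge` carrying the score and the matched signals, which is the information the user-facing warning would be built from.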
V Game
“V” is an integration game that aims to crowdsource content moderation by letting users flag and moderate illegal content. Players earn points for accurately reporting and moderating violations, training algorithms to better detect the content and getting rewarded for it.
Content Moderation | Paris 2023
AI Safety Education Initiative
Creation and maintenance of modular materials for education and intervention.
User Education | Paris 2023
An open source platform sharing public records of cross-platform abusive accounts. Our idea is a proactive online safety platform that seamlessly integrates with popular social media apps. It receives non-restricted data about an account enforced on one platform and uses it to check for matching or similar profiles on other platforms; AI then compiles a ‘risk assessment’ and alerts users to potential fraudulent or malicious accounts.
Data Sharing | Paris 2023
A knowledge-sharing platform for T&S professionals to share actionable intelligence on emerging online threats, aimed at facilitating early detection and coordinated responses to new trends.
Data Sharing | Paris 2023 | Top Idea
#IsItReal is a web-based chatbot for individuals and citizen real-time fact-checkers that helps them with 360° fact-checking of media files.
Governance | Safety by Design 2024
3 Design Changes
1) Ranking on quality, not engagement. 2) Limit rates for new & untrusted users to impede over-targeting of large groups of users. 3) Privacy settings by default.
Design | Safety by Design 2024
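As a rough illustration of the rate-limiting design change, the sketch below applies a sliding-window cap to outbound contacts, with a much lower cap for new accounts. The window length, the per-tier limits, the seven-day definition of "new", and the `OutreachLimiter` name are all assumptions made for this example; the idea as submitted does not specify them.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600                    # assumed one-hour sliding window
LIMITS = {"new": 5, "trusted": 100}      # hypothetical per-window caps
NEW_ACCOUNT_AGE_SECONDS = 7 * 24 * 3600  # assumed: accounts under 7 days are "new"


class OutreachLimiter:
    """Sliding-window limiter that is stricter for new accounts."""

    def __init__(self):
        # Per-user timestamps of recently allowed actions, oldest first.
        self._sent = defaultdict(deque)

    def allow(self, user_id, account_age_seconds, now):
        tier = "new" if account_age_seconds < NEW_ACCOUNT_AGE_SECONDS else "trusted"
        history = self._sent[user_id]
        # Drop timestamps that have fallen out of the window.
        while history and now - history[0] >= WINDOW_SECONDS:
            history.popleft()
        if len(history) >= LIMITS[tier]:
            return False
        history.append(now)
        return True
```

Passing `now` in explicitly, rather than reading the clock inside `allow`, keeps the limiter easy to test; a caller would typically supply `time.time()`.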
Red Team Upstream
Create a joint government/civil society Safety-by-Design ‘Red Team’ working ‘upstream’ with emerging tech sources e.g. university depts/sci hubs. Later develop ‘downstream’ for companies.
Partnerships | Safety by Design 2024
The Cultural Context Database
A global database of traditional cultural expressions to ensure that automated AI systems can make the policy exceptions needed to protect free speech and local cultures.
Data Sharing | Safety by Design 2024
A Certification Program, run by a not-for-profit, to help platforms demonstrate their commitment to Safety by Design to their business partners.
Certifications | Safety by Design 2024
Trust Tech Summit
Set minimum standards. Simplicity & clarity in regulations. T&S trainings by regulators for founders, legal, policy teams, etc.
Industry Standards | Safety by Design 2024
User Safety as a Common Standard
Apps required to be assessed against Safety by Design Principles. Empowering users to be safe by giving them visibility into apps’ safety features.
Industry Standards | Safety by Design 2024
A database of “protected likenesses” where biometric data is stored in a matchable but not reconstructible state (e.g. hashes or tokens). Platforms can submit suspicious photos and videos for comparison to this database. This centralized database would significantly reduce likeness theft by enabling a rapid, actionable, and accurate matching mechanism, available to any participating platform, without any increased friction at onboarding.
Data Sharing | Safety by Design 2024 | Top Idea
WhatsApp, What’s up?
Interrupting social dynamics of small-group or 1:1 harms on end-to-end encrypted platforms across many harm types. More oversight on app and technology safety while mitigating the risks that E2E brings to these innovations. Addressing the risks with proactive and preventive measures.
Design | Safety by Design 2024
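One way to picture the matching step for the “protected likenesses” database is a nearest-match lookup over fixed-length binary tokens, as sketched below. This is a sketch under stated assumptions, not the proposers' design: it assumes some upstream encoder (not shown, and hypothetical) already turns a face or voice sample into a 64-bit token that cannot be reversed into the original, and it assumes similar samples yield tokens that differ in only a few bits. `LikenessRegistry`, `MAX_DISTANCE`, and the token format are illustrative names and values.

```python
from typing import Optional

MAX_DISTANCE = 6  # assumed tolerance, in differing bits, for calling two tokens a match


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two integer tokens differ."""
    return bin(a ^ b).count("1")


class LikenessRegistry:
    """Stores opaque tokens only; raw biometric data never enters the registry."""

    def __init__(self):
        self._tokens = {}  # person_id -> integer token

    def enroll(self, person_id: str, token: int) -> None:
        self._tokens[person_id] = token

    def match(self, token: int) -> Optional[str]:
        """Return the closest enrolled person_id within MAX_DISTANCE, else None."""
        best_id, best_distance = None, MAX_DISTANCE + 1
        for person_id, stored in self._tokens.items():
            distance = hamming_distance(stored, token)
            if distance < best_distance:
                best_id, best_distance = person_id, distance
        return best_id
```

A participating platform would encode a suspicious photo or video with the same encoder and call `match`; a linear scan is fine for illustration, though a real registry would need an index suited to approximate matching at scale.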
Bridging Safety Gaps
Sporadically include AI-generated ads/content that are customized to users’ interests, along with a detailed list of things to look out for if they interact with the content/ad.
User Education | Safety by Design 2024
SafeRoute BluePrint
Next-generation assessment engine that helps tech startups evaluate their trust and safety features against industry standards. Offers personalized recommendations and roadmaps tailored to each company for easy prioritization.
Design | Safety by Design 2024
User-Driven Safety Scorecard
Create a global survey for users worldwide to evaluate safety risks on the top 20 platforms. 7-point Likert scale on crucial safety dimensions.
User Education | Safety by Design 2024
Safer Together: Inclusive Safety by Design
Building products inclusively in the Safety by Design process with members of the most at-risk communities for Trust and Safety issues such as bullying, fraud, and harassment. These include: people with disabilities; people in gender transition; children; neurodivergent communities; LGBTQ+; seniors.
Design | Safety by Design 2024
Project Safety Sidekick: Safety by Design AI Assistant
Our customized ‘Safety by Design’ AI assistant will take in the input to analyze similar features across the T&S industry and provide an output highlighting: a) the potential risks or harms of the feature or business; b) the relevant user policies the platform should have in place; c) the safety features that should be accessible to users of the app; d) the data analysis methods that should be used to quantify the effectiveness of the safety features implemented.
Design | Safety by Design 2024 | Top Idea
Safety Unified Solution (SBD Toolkit)
A toolkit built and distributed by an NGO or professional organization that provides information and resources that can be easily implemented by any organization.
Partnerships | Safety by Design 2024
Safety Steward (Safety Stu)
We propose a “forward defence” public tech AI agent available (opt-in) to each citizen that: renders ToS readable by a real person (currently less than 1% read ToS); synthesizes this with cookie/tracking metadata and historical data governance info to make it clear to users, in natural language, where their data is collected and where it is being sent; assists with dynamic rendering of sites for accessibility concerns.
Design | Safety by Design 2024
Standardizing T&S Evaluation for LLM-based models
Develop a standardized framework for specific T&S threats to use between model validation and deployment for LLM-based models. Create a stakeholder hub for signatories (AI companies, investors) to co-create and adopt this framework.
Industry Standards | Safety by Design 2024 | Top Idea
Synthesize new threats with LLMs that can red-team your current policy. Augment private, historical examples of abuse detected under your current policy for stress testing. Allow engineers to automatically generate real-time threats for ML development. Share synthetically generated data across companies for collaboration.
Design | Safety by Design 2024 | Top Idea

Got an idea for the T&S community to work on? Let us know about it!