Public Data Sourcing & Collection Notice
Effective date: September 26, 2024
Summary
Grably, Inc. (“Grably”) collects and curates publicly available posts and associated metadata from open Telegram channels and groups. Our datasets may include verbatim public messages and are offered for research, analytics, and AI/ML training. We do not access private messages, invite-only or private channels, or content behind authentication, paywalls, or other access controls. Grably does not transfer ownership of Telegram content; customers receive a limited license to use our datasets for internal analytics and AI/ML training in compliance with applicable law. Our pipelines include safeguards to minimize the incidental collection of personal information, and we honor valid takedown requests.
What we collect
- Public posts and associated channel-level metadata from open Telegram channels/supergroups (e.g., channel title, public description, follower counts as publicly shown).
- Message-level elements necessary for analytics (e.g., post text, post timestamp, message ID, message type, links contained in posts).
- Provenance fields that allow auditing (e.g., source channel identifier, canonical message link where available, first-seen/last-seen timestamps, dataset version).
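For readers who want a concrete picture of the fields listed above, the following is a purely illustrative sketch of how one message record might be represented. It is not Grably's actual schema: the class name `MessageRecord`, every field name, and the `audit_key` helper are hypothetical and chosen only to mirror the categories in this section.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class MessageRecord:
    """Hypothetical record layout; field names are assumptions, not a published schema."""
    # Provenance fields
    source_channel_id: str          # source channel identifier
    canonical_link: Optional[str]   # canonical message link, where available
    first_seen: datetime
    last_seen: datetime
    dataset_version: str
    # Message-level elements
    message_id: int
    message_type: str
    text: str
    posted_at: datetime
    links: Tuple[str, ...] = ()


def audit_key(record: MessageRecord) -> str:
    """Build a stable string that identifies a record within a dataset version."""
    return f"{record.source_channel_id}/{record.message_id}@{record.dataset_version}"
```

A key of this kind is one way an auditor could trace a dataset row back to its source channel and dataset version.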
What we don't collect
- No private communications (no DMs, private channels, invite-only groups).
- No credentialed or paywalled content; no circumvention of technical access controls (e.g., CAPTCHAs, login walls, rate-limit bypassing).
- No intentional collection of personal information about identifiable individuals (e.g., phone numbers, email addresses, government IDs, precise GPS coordinates). Our pipelines include filters to detect/redact such items where technically feasible, and we remove content on valid notice.
How we collect
We do not deploy techniques intended to defeat platform technical measures.
Lawful basis & platform respect
Grably processes publicly available user-generated content for legitimate business purposes such as research, safety, trend analysis, product improvement, and (under license) AI/ML training and evaluation.
Customers are responsible for using our datasets in compliance with applicable law and platform policies. The Dataset contains third-party Telegram content (e.g., messages, reactions, channel metadata). Subject to the Agreement, Grably authorizes Customer to copy, process, transform, and create derivatives from the Dataset, including the third-party content it contains, for Customer's internal analytics and AI/ML training and evaluation. This authorization does not permit public redistribution of raw Telegram messages/media or the creation of services that provide access to raw content.
Derived data. "Derived Data" (e.g., embeddings, model weights, labels, statistics) that does not contain or reconstruct verbatim third-party content may be used internally by Customer; models must not be designed to reproduce or attribute verbatim Telegram content.
Minimization and safeguards
- Pre-publication filtering. We apply pattern-based rules to exclude likely direct identifiers (e.g., phone numbers, emails, government IDs) and certain sensitive categories where feasible.
- Child safety. We do not knowingly collect from child-directed channels and remove such sources if identified.
- Security. Data is encrypted in transit and at rest, with role-based access controls and least-privilege practices.
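As a rough illustration of what pattern-based filtering can look like, the sketch below replaces likely email addresses and phone numbers with placeholders. It is a minimal example under stated assumptions, not a description of Grably's actual rules: the regular expressions, the placeholder tokens, and the function name `redact_identifiers` are all hypothetical, and patterns this coarse will both miss some identifiers and flag some harmless digit sequences.

```python
import re

# Hypothetical, deliberately coarse patterns; real identifier formats vary by country.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional "+", then at least nine digits/separators, starting and ending on a digit.
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)")


def redact_identifiers(text: str) -> str:
    """Replace likely emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

Because rules like these are heuristic, they complement rather than replace the removal of content on valid notice.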
Jurisdictional notes
For residents of certain U.S. states (e.g., California) or countries (e.g., Germany), additional rights and disclosures may apply.
Grably does not intend to collect or sell "personal information." If you believe our datasets include your personal information, contact us at [email protected].
Changes to this notice
We may update this notice from time to time. Material changes will be posted here with a new "Effective date".
All rights remain with the original creators or Telegram. Customers’ use of the Dataset must comply with applicable copyright law and platform terms.