python-telegram-bot/telegram/_messageentity.py

#!/usr/bin/env python
#
# A library that provides a Python interface to the Telegram Bot API
# Copyright (C) 2015-2024
# Leandro Toledo de Souza <devs@python-telegram-bot.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser Public License for more details.
#
# You should have received a copy of the GNU Lesser Public License
# along with this program. If not, see [http://www.gnu.org/licenses/].
"""This module contains an object that represents a Telegram MessageEntity."""
import copy
import itertools
from typing import TYPE_CHECKING, Dict, Final, List, Optional, Sequence
from telegram import constants
from telegram._telegramobject import TelegramObject
from telegram._user import User
from telegram._utils import enum
from telegram._utils.strings import TextEncoding
from telegram._utils.types import JSONDict
if TYPE_CHECKING:
    from telegram import Bot


class MessageEntity(TelegramObject):
    """
    This object represents one special entity in a text message. For example, hashtags,
    usernames, URLs, etc.

    Objects of this class are comparable in terms of equality. Two objects of this class are
    considered equal, if their :attr:`type`, :attr:`offset` and :attr:`length` are equal.

    Args:
        type (:obj:`str`): Type of the entity. Can be :attr:`MENTION` (@username),
            :attr:`HASHTAG` (#hashtag), :attr:`CASHTAG` ($USD), :attr:`BOT_COMMAND`
            (/start@jobs_bot), :attr:`URL` (https://telegram.org),
            :attr:`EMAIL` (do-not-reply@telegram.org), :attr:`PHONE_NUMBER` (+1-212-555-0123),
            :attr:`BOLD` (**bold text**), :attr:`ITALIC` (*italic text*), :attr:`UNDERLINE`
            (underlined text), :attr:`STRIKETHROUGH`, :attr:`SPOILER` (spoiler message),
            :attr:`BLOCKQUOTE` (block quotation), :attr:`CODE` (monowidth string), :attr:`PRE`
            (monowidth block), :attr:`TEXT_LINK` (for clickable text URLs), :attr:`TEXT_MENTION`
            (for users without usernames), :attr:`CUSTOM_EMOJI` (for inline custom emoji stickers).

            .. versionadded:: 20.0
                Added inline custom emoji

            .. versionadded:: 20.8
                Added block quotation
        offset (:obj:`int`): Offset in UTF-16 code units to the start of the entity.
        length (:obj:`int`): Length of the entity in UTF-16 code units.
        url (:obj:`str`, optional): For :attr:`TEXT_LINK` only, url that will be opened after
            user taps on the text.
        user (:class:`telegram.User`, optional): For :attr:`TEXT_MENTION` only, the mentioned
            user.
        language (:obj:`str`, optional): For :attr:`PRE` only, the programming language of
            the entity text.
        custom_emoji_id (:obj:`str`, optional): For :attr:`CUSTOM_EMOJI` only, unique identifier
            of the custom emoji. Use :meth:`telegram.Bot.get_custom_emoji_stickers` to get full
            information about the sticker.

            .. versionadded:: 20.0

    Attributes:
        type (:obj:`str`): Type of the entity. Can be :attr:`MENTION` (@username),
            :attr:`HASHTAG` (#hashtag), :attr:`CASHTAG` ($USD), :attr:`BOT_COMMAND`
            (/start@jobs_bot), :attr:`URL` (https://telegram.org),
            :attr:`EMAIL` (do-not-reply@telegram.org), :attr:`PHONE_NUMBER` (+1-212-555-0123),
            :attr:`BOLD` (**bold text**), :attr:`ITALIC` (*italic text*), :attr:`UNDERLINE`
            (underlined text), :attr:`STRIKETHROUGH`, :attr:`SPOILER` (spoiler message),
            :attr:`BLOCKQUOTE` (block quotation), :attr:`CODE` (monowidth string), :attr:`PRE`
            (monowidth block), :attr:`TEXT_LINK` (for clickable text URLs), :attr:`TEXT_MENTION`
            (for users without usernames), :attr:`CUSTOM_EMOJI` (for inline custom emoji stickers).

            .. versionadded:: 20.0
                Added inline custom emoji

            .. versionadded:: 20.8
                Added block quotation
        offset (:obj:`int`): Offset in UTF-16 code units to the start of the entity.
        length (:obj:`int`): Length of the entity in UTF-16 code units.
        url (:obj:`str`): Optional. For :attr:`TEXT_LINK` only, url that will be opened after
            user taps on the text.
        user (:class:`telegram.User`): Optional. For :attr:`TEXT_MENTION` only, the mentioned
            user.
        language (:obj:`str`): Optional. For :attr:`PRE` only, the programming language of
            the entity text.
        custom_emoji_id (:obj:`str`): Optional. For :attr:`CUSTOM_EMOJI` only, unique identifier
            of the custom emoji. Use :meth:`telegram.Bot.get_custom_emoji_stickers` to get full
            information about the sticker.

            .. versionadded:: 20.0

    """

    __slots__ = ("custom_emoji_id", "language", "length", "offset", "type", "url", "user")

    def __init__(
        self,
        type: str,  # pylint: disable=redefined-builtin
        offset: int,
        length: int,
        url: Optional[str] = None,
        user: Optional[User] = None,
        language: Optional[str] = None,
        custom_emoji_id: Optional[str] = None,
        *,
        api_kwargs: Optional[JSONDict] = None,
    ):
        super().__init__(api_kwargs=api_kwargs)
        # Required
        self.type: str = enum.get_member(constants.MessageEntityType, type, type)
        self.offset: int = offset
        self.length: int = length
        # Optionals
        self.url: Optional[str] = url
        self.user: Optional[User] = user
        self.language: Optional[str] = language
        self.custom_emoji_id: Optional[str] = custom_emoji_id

        self._id_attrs = (self.type, self.offset, self.length)

        self._freeze()

    @classmethod
    def de_json(
        cls, data: Optional[JSONDict], bot: Optional["Bot"] = None
    ) -> Optional["MessageEntity"]:
        """See :meth:`telegram.TelegramObject.de_json`."""
        data = cls._parse_data(data)

        if not data:
            return None

        data["user"] = User.de_json(data.get("user"), bot)

        return super().de_json(data=data, bot=bot)

    @staticmethod
    def adjust_message_entities_to_utf_16(
        text: str, entities: Sequence["MessageEntity"]
    ) -> Sequence["MessageEntity"]:
        """Utility functionality for converting the offset and length of entities from
        Unicode (:obj:`str`) to UTF-16 (``utf-16-le`` encoded :obj:`bytes`).

        Tip:
            Only offsets and lengths calculated in UTF-16 are accepted by the Telegram Bot API.
            If they are calculated using the Unicode string (:obj:`str` object), errors may occur
            when the text contains characters that are not in the Basic Multilingual Plane (BMP).
            For more information, see `Unicode <https://en.wikipedia.org/wiki/Unicode>`_ and
            `Plane (Unicode) <https://en.wikipedia.org/wiki/Plane_(Unicode)>`_.

        .. versionadded:: 21.4

        Examples:
            Below is a snippet of code that demonstrates how to use this function to convert
            entities from Unicode to UTF-16 space. The ``unicode_entities`` are calculated in
            Unicode and the ``utf_16_entities`` are calculated in UTF-16.

            .. code-block:: python

                text = "𠌕 bold 𝄢 italic underlined: 𝛙𝌢𑁍"
                unicode_entities = [
                    MessageEntity(offset=2, length=4, type=MessageEntity.BOLD),
                    MessageEntity(offset=9, length=6, type=MessageEntity.ITALIC),
                    MessageEntity(offset=28, length=3, type=MessageEntity.UNDERLINE),
                ]
                utf_16_entities = MessageEntity.adjust_message_entities_to_utf_16(
                    text, unicode_entities
                )
                await bot.send_message(
                    chat_id=123,
                    text=text,
                    entities=utf_16_entities,
                )
                # utf_16_entities[0]: offset=3, length=4
                # utf_16_entities[1]: offset=11, length=6
                # utf_16_entities[2]: offset=30, length=6

        Args:
            text (:obj:`str`): The text that the entities belong to
            entities (Sequence[:class:`telegram.MessageEntity`]): Sequence of entities
                with offset and length calculated in Unicode

        Returns:
            Sequence[:class:`telegram.MessageEntity`]: Sequence of entities
            with offset and length calculated in UTF-16 encoding
        """
        # get sorted positions
        positions = sorted(itertools.chain(*((x.offset, x.offset + x.length) for x in entities)))
        accumulated_length = 0
        # calculate the length of each slice text[:position] in utf-16 accordingly,
        # store the position translations
        position_translation: Dict[int, int] = {}
        for i, position in enumerate(positions):
            last_position = positions[i - 1] if i > 0 else 0
            text_slice = text[last_position:position]
            accumulated_length += len(text_slice.encode(TextEncoding.UTF_16_LE)) // 2
            position_translation[position] = accumulated_length
        # get the final output entities
        out = []
        for entity in entities:
            translated_positions = position_translation[entity.offset]
            translated_length = (
                position_translation[entity.offset + entity.length] - translated_positions
            )
            new_entity = copy.copy(entity)
            with new_entity._unfrozen():
                new_entity.offset = translated_positions
                new_entity.length = translated_length
            out.append(new_entity)
        return out

    ALL_TYPES: Final[List[str]] = list(constants.MessageEntityType)
    """List[:obj:`str`]: A list of all available message entity types."""
    BLOCKQUOTE: Final[str] = constants.MessageEntityType.BLOCKQUOTE
    """:const:`telegram.constants.MessageEntityType.BLOCKQUOTE`

    .. versionadded:: 20.8
    """
    BOLD: Final[str] = constants.MessageEntityType.BOLD
    """:const:`telegram.constants.MessageEntityType.BOLD`"""
    BOT_COMMAND: Final[str] = constants.MessageEntityType.BOT_COMMAND
    """:const:`telegram.constants.MessageEntityType.BOT_COMMAND`"""
    CASHTAG: Final[str] = constants.MessageEntityType.CASHTAG
    """:const:`telegram.constants.MessageEntityType.CASHTAG`"""
    CODE: Final[str] = constants.MessageEntityType.CODE
    """:const:`telegram.constants.MessageEntityType.CODE`"""
    CUSTOM_EMOJI: Final[str] = constants.MessageEntityType.CUSTOM_EMOJI
    """:const:`telegram.constants.MessageEntityType.CUSTOM_EMOJI`

    .. versionadded:: 20.0
    """
    EMAIL: Final[str] = constants.MessageEntityType.EMAIL
    """:const:`telegram.constants.MessageEntityType.EMAIL`"""
    EXPANDABLE_BLOCKQUOTE: Final[str] = constants.MessageEntityType.EXPANDABLE_BLOCKQUOTE
    """:const:`telegram.constants.MessageEntityType.EXPANDABLE_BLOCKQUOTE`

    .. versionadded:: 21.3
    """
    HASHTAG: Final[str] = constants.MessageEntityType.HASHTAG
    """:const:`telegram.constants.MessageEntityType.HASHTAG`"""
    ITALIC: Final[str] = constants.MessageEntityType.ITALIC
    """:const:`telegram.constants.MessageEntityType.ITALIC`"""
    MENTION: Final[str] = constants.MessageEntityType.MENTION
    """:const:`telegram.constants.MessageEntityType.MENTION`"""
    PHONE_NUMBER: Final[str] = constants.MessageEntityType.PHONE_NUMBER
    """:const:`telegram.constants.MessageEntityType.PHONE_NUMBER`"""
    PRE: Final[str] = constants.MessageEntityType.PRE
    """:const:`telegram.constants.MessageEntityType.PRE`"""
    SPOILER: Final[str] = constants.MessageEntityType.SPOILER
    """:const:`telegram.constants.MessageEntityType.SPOILER`

    .. versionadded:: 13.10
    """
    STRIKETHROUGH: Final[str] = constants.MessageEntityType.STRIKETHROUGH
    """:const:`telegram.constants.MessageEntityType.STRIKETHROUGH`"""
    TEXT_LINK: Final[str] = constants.MessageEntityType.TEXT_LINK
    """:const:`telegram.constants.MessageEntityType.TEXT_LINK`"""
    TEXT_MENTION: Final[str] = constants.MessageEntityType.TEXT_MENTION
    """:const:`telegram.constants.MessageEntityType.TEXT_MENTION`"""
    UNDERLINE: Final[str] = constants.MessageEntityType.UNDERLINE
    """:const:`telegram.constants.MessageEntityType.UNDERLINE`"""
    URL: Final[str] = constants.MessageEntityType.URL
    """:const:`telegram.constants.MessageEntityType.URL`"""