Project Link: https://github.com/bdim404/HackerNews-Summarize-Telegram-Bot

Everyone is welcome to contribute to this project with pull requests. I will keep learning as I build on it.

First, I learned the most basic library for writing Telegram bots in Python: python-telegram-bot.

The most basic usage is:

# Import the telegram library for writing bots
from telegram import Update
from telegram.ext import ApplicationBuilder, CommandHandler, ContextTypes

$ # This installs the latest stable release
$ pip install python-telegram-bot --upgrade

This is the simplest template I first learned for writing bots in Python, from https://python-telegram-bot.org/

from telegram import Update
from telegram.ext import ApplicationBuilder, CommandHandler, ContextTypes

async def hello(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    await update.message.reply_text(f'Hello {update.effective_user.first_name}')

app = ApplicationBuilder().token("YOUR TOKEN HERE").build()
app.add_handler(CommandHandler("hello", hello))
app.run_polling()

The first reference I used was the timer bot example, timerbot. In timerbot.py, I found that the handler function for a Telegram command can also contain its own logic:

async def unset(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    """Remove the job if the user changed their mind."""
    chat_id = update.message.chat_id
    job_removed = remove_job_if_exists(str(chat_id), context)
    text = "Timer successfully cancelled!" if job_removed else "You have no active timer."
    await update.message.reply_text(text)

In addition, the telegram library is very powerful and has many uses, including preprocessing the messages a bot receives. Its handlers are defined with async def and use await when calling methods such as reply_text. This is Python's own asynchronous syntax and actually has nothing to do with the library itself.

From the Echobot.py example (https://docs.python-telegram-bot.org/en/v20.6/examples.echobot.html), I learned that the text a user sends to the bot is available in update.message.text.

async def echo(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    """Echo the user message."""
    await update.message.reply_text(update.message.text)

I also learned about the logging library (https://docs.python.org/zh-cn/3/howto/logging.html), mainly used to record errors and various logs:

# myapp.py
import logging
import mylib

def main():
    logging.basicConfig(filename='myapp.log', level=logging.INFO)
    logging.info('Started')
    mylib.do_something()
    logging.info('Finished')

if __name__ == '__main__':
    main()

If you run myapp.py, you should see in myapp.log:

INFO:root:Started
INFO:root:Doing something
INFO:root:Finished
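
The "Doing something" line in that log comes from mylib, which is not shown here. A minimal self-contained sketch follows, assuming do_something() only logs one message. It is defined inline rather than in a separate mylib.py, and the log goes to an in-memory buffer (a stand-in for myapp.log) so the result can be inspected directly. The names run_app and log_buffer are made up for this illustration.

```python
import io
import logging

# Stand-in for mylib.do_something(); assumed to log a single message.
def do_something():
    logging.info('Doing something')

def run_app(stream):
    # force=True replaces any handlers configured earlier (Python 3.8+).
    logging.basicConfig(stream=stream, level=logging.INFO, force=True)
    logging.info('Started')
    do_something()
    logging.info('Finished')

log_buffer = io.StringIO()
run_app(log_buffer)
log_text = log_buffer.getvalue()
print(log_text, end='')
```

Because both functions log through the root logger, every line carries the "root" name in the default format.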

I referred to a Telegram ChatGPT bot written by an experienced developer, who put the API key and the bot's token in a .env configuration file and used the dotenv library to load that file wherever the values are needed.

import os
from dotenv import load_dotenv
# Read the .env file
load_dotenv()
# Afterwards, read values with os.environ[...] or os.environ.get();
# "config" is a placeholder name for the dict these entries belong to.
config = {
    'api_key': os.environ['OPENAI_API_KEY'],
    'show_usage': os.environ.get('SHOW_USAGE', 'false').lower() == 'true',
}
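
How the two lookups differ can be sketched without python-dotenv: os.environ[...] raises KeyError when a required value is missing, while os.environ.get(...) falls back to a default. The load_config helper, its env parameter, and the sample values are hypothetical and used only for illustration.

```python
import os

def load_config(env=None):
    """Build a settings dict from environment variables (hypothetical helper)."""
    if env is None:
        env = os.environ
    return {
        # Required: a missing key raises KeyError immediately.
        'api_key': env['OPENAI_API_KEY'],
        # Optional: defaults to 'false', compared case-insensitively.
        'show_usage': env.get('SHOW_USAGE', 'false').lower() == 'true',
    }

# A plain dict stands in for the environment here.
sample_config = load_config({'OPENAI_API_KEY': 'sk-example', 'SHOW_USAGE': 'True'})
print(sample_config['show_usage'])
```

Failing fast on the required key means a missing token shows up at startup rather than on the first API call.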

In summarize-bot 1.0, I did not load the configuration from a .env file, but I will use it later because it makes permissions easier to set and code iterations easier to manage.

Here is a handy html2text setup that my teacher fernvenuevenue gave me:

h2t = html2text.HTML2Text()
h2t.ignore_tables = True
h2t.ignore_images = True
h2t.google_doc = True
h2t.ignore_links = True

Since my bot sends the content to OpenAI for processing, I also learned how to call the OpenAI API.

import os
import openai
openai.api_key = os.getenv("OPENAI_API_KEY")

# openai.ChatCompletion is the interface of openai package versions before 1.0
completion = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)
print(completion.choices[0].message)

October 24th Update

Learned to use the urllib.parse library to check that links come from readhacker.news, which prevents the bot from being abused.

import urllib.parse

# Check that the link's domain is readhacker.news
def is_valid_link(link):
    parsed_url = urllib.parse.urlparse(link)
    return parsed_url.netloc == "readhacker.news"
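
Comparing netloc exactly rejects look-alike hosts, but it also rejects an uppercase host or an explicit port. A slightly stricter sketch follows, assuming only http and https links should be accepted. The name is_valid_readhacker_link and the sample URLs are hypothetical, not taken from the bot.

```python
import urllib.parse

def is_valid_readhacker_link(link):
    """Accept only http(s) links whose host is exactly readhacker.news."""
    parsed_url = urllib.parse.urlparse(link)
    # hostname is lowercased and excludes any port, unlike netloc.
    return (parsed_url.scheme in ("http", "https")
            and parsed_url.hostname == "readhacker.news")

print(is_valid_readhacker_link("https://readhacker.news/s/abc"))
print(is_valid_readhacker_link("https://readhacker.news.example.com/s/abc"))
```

The second URL is rejected because its host only starts with the expected domain; a link with no scheme is rejected too, since urlparse then leaves the host empty.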

I also learned pipreqs, a tool that scans the .py files in a project directory and writes the libraries they import into requirements.txt.

Example

$ pipreqs /home/project/location
Successfully saved requirements file in /home/project/location/requirements.txt

October 29th Update

Learned AsyncIO & Asynchronous Programming in Python

import asyncio

async def main():
    # Schedule other_function() to run concurrently
    task = asyncio.create_task(other_function())
    print("a")
    await asyncio.sleep(1)
    print("b")

async def other_function():
    print("1")
    await asyncio.sleep(2)
    print("2")

asyncio.run(main())

This is how asynchronous code works in Python. After creating a task with:

task = asyncio.create_task(other_function())

other_function() runs concurrently whenever main() is paused at an await, but it is cancelled if main() finishes first. If you want to wait for the second function to finish, you need to add:

await task

This waits for the task to finish before main() returns.
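
Putting the pieces together, here is a runnable sketch of both cases. The run_demo helper, the events list, and the shortened sleep times are assumptions made for illustration. Instead of printing, each coroutine records its step so the order can be inspected.

```python
import asyncio

async def other_function(events):
    events.append("1")
    await asyncio.sleep(0.05)
    events.append("2")

async def main(events, wait_for_task):
    # Schedule other_function(); it starts at main()'s next await.
    task = asyncio.create_task(other_function(events))
    events.append("a")
    await asyncio.sleep(0.01)
    events.append("b")
    if wait_for_task:
        # Without this, asyncio.run() cancels the unfinished task.
        await task

def run_demo(wait_for_task):
    events = []
    asyncio.run(main(events, wait_for_task))
    return events

print(run_demo(wait_for_task=False))
print(run_demo(wait_for_task=True))
```

The first call ends without recording "2" because main() returned while the task was still sleeping; the second records it because main() awaited the task.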