-rw-r--r-- | README.md | 257 | ||||
-rw-r--r-- | config.yaml | 114 | ||||
-rw-r--r-- | matterpuppeter.py | 320 |
3 files changed, 691 insertions, 0 deletions
diff --git a/README.md b/README.md new file mode 100644 index 0000000..584391d --- /dev/null +++ b/README.md @@ -0,0 +1,257 @@ +# Matterpuppeter + +*An API plugin for Matterbridge, that creates IRC puppets* + +## Features + +Turns this: + +``` +<Someone> rPerson, rando: hi both +<Matterbridge> [j] <rPerson> hello +<Matterbridge> [mx] <rando> Hi! +<Matterbridge> [irc2] <user> can i join in? +``` + +into this: + +``` +<Someone> rPerson|j, rando|mx: hi both +<rPerson|j> hello +--> rando|mx (934782619@matterpuppeter) (@rando:matrix.org [matrix.mxo]) has joined #chat +<rando|mx> Hi! +<user|irc2> can i join in? +``` + +- *Lazy puppeting*: creates a puppet only when users join or talk +- *Client limit*: limits how many puppets can connect at the same time +- *Built-in configurable pastebinning* of long codeblocks +- Works with *Matterbridge*'s REST API +- *Limited quit/join flood*: keeps connections alive during a Matterbridge + restart + - Puppets are marked as "away" until Matterpuppeter can reconnect to the + Matterbridge API + +In other words: + +``` + Any protocol supported + by Matterbridge + (e.g. XMPP, Discord, Matrix) + | + | + v + Matterbridge + | + | Matterbridge API + v + Matterpuppeter + | + | IRC protocol + v + IRC network + (e.g. OFTC, Libera.Chat) + | + | + v + IRC channel + (e.g. ##coffee) +``` + +This is not a standalone bridge. + +## Caveats + +- *Lazy puppeting*: while this makes client counts lower, it still means + people who do not have a puppet can still watch the channel on the other + side, without users on IRC knowing. 
+- *Portalling-style bridging* is not supported, on-purpose +- *Cannot bridge bans*: matterbridge does not support bridging bans and kicks +- *Private messages* are not supported +- *Connecting to multiple networks* is not supported - run Matterpuppeter + multiple times for each network instead + +## How to + +This currently requires a fork of Matterbridge: https://git.vitali64.duckdns.org/misc/matterbridge.git + +Create a Matterbridge config by following [this guide](https://github.com/42wim/matterbridge/wiki/How-to-create-your-config). + +Add the following at the top of your `matterbridge.toml`: + +``` +[api.matterpuppeter] +NoSendJoinPart=false +ShowJoinPart=true +BindAddress="127.0.0.1:4242" +Buffer=1000 +RemoteNickFormat="{NICK}-{LABEL}" +Label="IRC" +``` + +If you're bridging to multiple IRC networks and want to use Matterpuppeter on +all of them, you'll need one Matterpuppeter instance per network. + +Add the following to one of the gateways you have configured: + +``` +[[gateway]] +name="gateway1" +<...> + + [[gateway.inout]] + account="api.matterpuppeter" + channel="api" + +<...> +``` + +Now, you can configure Matterpuppeter itself. The configuration is in +`config.yaml`, and is extensively documented. Do make sure to customise +it, so it fits your bridging setup. + +Here is a simple config file that will bridge to a `##Meow` channel on Libera.Chat: + +``` +irc: + host: "irc.libera.chat" + port: 6697 + tls: true + nick: "_Bridger" + gecos: "Matterpuppeter listener bot" + message_limit: 380 + client_limit: 6 + + sasl: + enable: false + +paste: + enable: false + +ident: + enable: false + +api: + host: "http://localhost:4242" + account: "api.matterpuppeter" + +gateway: + "gateway1": "##Meow" +``` + +### Pastebinning support + +Long code blocks can be pastebinned to avoid spamming IRC. This can be useful +when it is common practise to paste snippets of code which would cause spam on +the IRC side. 
For example, the following message: + +``` +<f_> Here's my code: + + | #include <stdio.h> + | + | int main(int argc, char **argv) + | { + | print("This is a message\n"); + | + | return 0; + | } + +``` + +can be bridged as: + +``` +<f_-X> Here's my code: +<f_-X> https://pastebin.matterpuppeter/<paste id here> +``` + +To configure this behaviour, first create a directory where Matterpuppeter will +write code snippets to: + +``` +$ mkdir ~/pastes +``` + +Then, let an HTTP server serve files from that directory. For example, when using +nginx: + +``` +server { + server_name paste.matterpuppeter; + listen 80; + root /home/_mp/pastes; + + location / { + index index.html; + } +} +``` + +This is a very minimal configuration, which may not be suitable in production, +but it gives an overall idea of how things should be set up. Also make sure that +the `pastes` directory is readable by the user nginx runs as (typically `www-data`). + +Finally, you can configure Matterpuppeter to make use of that directory: + +``` +paste: + enable: true + domain: "http://paste.matterpuppeter/" + dir: "/home/_mp/pastes" + maxlines: 5 +``` + +Restart your HTTP server and Matterpuppeter, long code blocks should now be +pastebinned! You will need to set up cleaning up yourself at this point +however, but this is left as an exercise to the reader. + +### Identd + +Matterpuppeter can integrate with an ident daemon for the purposes of identifying +puppets. It does this by writing to an *ident file*, which would then be read by +your system's ident server (such as `oidentd`) and sent to the IRC server. After this +is done Matterpuppeter clears the contents of the file. 
+ +To do this, your `/etc/oidentd.conf` file must allow the user Matterpuppeter runs as to +spoof ident queries: + +``` +user "_mp" { + default { + allow spoof + allow spoof_all + } +} +``` + +Then create an `.oidentd.conf` file in the home directory of the Matterpuppeter user +while logged in as that user, and make sure `oidentd` can read it: + +``` +$ touch ~/.oidentd.conf +$ chmod 644 ~/.oidentd.conf +$ chmod 711 ~ +``` + +Finally, configure Matterpuppeter to write to that file: + +``` +<...> +ident: + enable: true + file: "/home/_mp/.oidentd.conf" + format: "global { reply \"%user%\" }" + username: "listener" +<...> +``` + +Then restart your ident daemon and Matterpuppeter, the puppets it creates should +now be properly idented! + +## TODO + +* Possibly full puppeting support (as in, onboarding every single user joined on + the other side), probably not as default however +* sed-style (`s///`) or SMS-style edits +* Cleanup diff --git a/config.yaml b/config.yaml new file mode 100644 index 0000000..271101b --- /dev/null +++ b/config.yaml @@ -0,0 +1,114 @@ +## +# Matterpuppeter configuration +# Also see the README for more information on how to get it working. +## + +# +# IRC config +# +irc: + # The IRC network/server host to connect to. + # Default: "localhost" + host: "localhost" + # The port to connect to, on the IRC server. Typically this is 6667 for + # plain, and 6697 for TLS. It is recommended to connect via TLS. + # Default: 6667 + port: 6667 + # Enable this if you wish to connect to a port via TLS. + # Default: false + tls: false + # The main bot's nickname. This bot will listen to messages on channels + # mentioned below. + # Default: "Matterpuppeter" + nick: "Matterpuppeter" + # The gecos or realname we want the bot to have. + # Default: "Matterbridge" + gecos: "Matterbridge" + # How many characters can be in a single message. IRC has a limit of 512 + # bytes, including the sender's nick!user@host, the command, target, etc.
+ # 380 here should be fine, unless you end up with extremely long nicknames. + # Default: 380 + message_limit: 380 + # How many puppets can be created. This depends on the network you're connecting + # to, most networks have limits on how many connections can come from a + # single host. Typically limits can be relaxed by requesting IRC network + # operators an I-Line and using an ident daemon. + # Default: 6 + client_limit: 6 + + # SASL configuration + sasl: + # If SASL authentication to e.g. NickServ should be enabled. Note that + # this requires that the network support SASL PLAIN (see the "sasl" + # IRCv3 CAP). If the network you're connecting to does not support + # SASL PLAIN, miniirc (IRC lib) will fallback to automatically messaging + # NickServ directly. + # Default: false + enable: false + username: "foobar" + password: "hunter2" + +# +# Pastebin config +# +# Sometimes people on other protocols paste long code blocks in the chat, which +# can spam IRC. When pastebinning is enabled, Matterpuppeter will replace any +# long code blocks with a link. Pastebinning long messages (in general, not just +# code blocks) is currently not implemented. +# +paste: + # Enable pastebinning. + # Default: false + enable: false + # On which domain pastebins will be served + # Default: "https://my.paste.bin/" + domain: "https://my.paste.bin/" + # On which directory pastes will be dropped in. The directory *must* already + # exist. + # Default: "/pastes" + dir: "/pastes" + # Maximum number of lines for a code block before it gets pastebinned. 5 is + # a sensible limit. The limit should not be too small, else that may annoy + # users, who have to open a link just to read e.g. one line of text/code. + # Default: 5 + maxlines: 5 + +# +# Ident config +# +# To identify puppets Matterpuppeter provides compatibility with the oidentd +# ident daemon. +# +ident: + # Enable identd. 
+ # Default: false + enable: false + # oidentd user configuration file (must be empty and read-writeable) + # Default: ".oidentd.conf" + file: ".oidentd.conf" + # Format used to write to the ident file. %user% is replaced by the user-id + # hash. + # Default: "global { reply \"%user%\" }" + format: "global { reply \"%user%\" }" + # Ident the bot will use. + # Default: "listener" + username: "listener" + +# +# Matterbridge API config +# +api: + # Host on which the matterbridge API is listening on. This is `BindAddress` + # on the Matterbridge configuration file `matterbridge.toml` + host: "http://localhost:4242" + # The account name set in matterbridge.toml. + account: "api.liberap" + +# +# Matterbridge gateways +# +# This depends on which gateways you configured in matterbridge.toml +# [gateway]. +# +gateway: + "gateway1": "#channel" diff --git a/matterpuppeter.py b/matterpuppeter.py new file mode 100644 index 0000000..2f872d3 --- /dev/null +++ b/matterpuppeter.py @@ -0,0 +1,320 @@ +#!/usr/bin/python +# SPDX-License-Identifier: CC0 +# +# Written by: Ferass El Hafidi <vitali64pmemail@protonmail.com> +# +from sys import exit +import json +import string +import socket +import requests +import miniirc +import hashlib +import yaml +import re +from time import sleep +from requests.adapters import HTTPAdapter, Retry + +puppets = {} + +def connect_new_puppet(nickname, user_id, gaccount): + if not puppets.get(gaccount): + puppets[gaccount] = {} + + try: + print("* Puppet %s already connected, skipping" % puppets[gaccount][user_id]) + + if puppets[gaccount][user_id].connected == True: + return puppets[gaccount][user_id] + except KeyError: + pass + + # If there's too many puppets, disconnect the oldest one + if len(puppets[gaccount]) == client_limit: + puppets[gaccount][0].disconnect(msg="Connection closed for inactivity") + puppets[gaccount].pop(0) + + # Sanitize nickname + allowed_chars = string.digits + string.ascii_letters + "^|\\-_[]{}" + sanitized_nickname = nickname + 
+ for char in sanitized_nickname: + if char not in allowed_chars: + # Try to represent illegal characters with similar, allowed ones + if char == "(" or char == "<": + sanitized_nickname = sanitized_nickname.replace(char, "[") + elif char == ")" or char == ">": + sanitized_nickname = sanitized_nickname.replace(char, "]") + elif char == "/": + sanitized_nickname = sanitized_nickname.replace(char, "|") + elif char == "." or char == " ": + sanitized_nickname = sanitized_nickname.replace(char, "_") + else: + sanitized_nickname = sanitized_nickname.replace(char, "-") + if sanitized_nickname[0] in string.digits + '-': + sanitized_nickname = "_" + sanitized_nickname + + if (sanitized_nickname.endswith("Serv") or "." in nickname) and gaccount.startswith("irc."): + print("* Not creating puppet of a network service/server (%s)" % sanitized_nickname) + return + + print("* Connecting new puppet %s (user_id %s) as %s" % (nickname, + user_id, sanitized_nickname)) + + puppets[gaccount][user_id] = miniirc.IRC( + irc_host, + irc_port, + sanitized_nickname, + realname="%s [%s]" % (user_id, gaccount), + auto_connect=False, + debug=False, + ssl=irc_tls, + persist=False, + quit_message="quit" + ) + puppets[gaccount][user_id].Handler('PRIVMSG', colon=False)(fail_on_pm) + userid_hash = hashlib.sha3_224(bytes(user_id, "utf-8")) + write_ident_file(str(userid_hash.hexdigest())) + + puppets[gaccount][user_id].connect() + + while puppets[gaccount][user_id].connected != True: + pass + + return puppets[gaccount][user_id] + +def write_ident_file(ident): + if ident_enabled: + print("* Writing to ident file for oidentd/ident2") + with open(ident_file, 'w') as file: + file.write(ident_fmt.replace("%user%", ident)) + +def get_puppet_client(gaccount, user_id): + try: + ret = puppets[gaccount][user_id] + except KeyError: + print("* Error finding puppet client %s (%s)" % (user_id, gaccount)) + return None + return ret + +def on_irc_msg(irc, hostmask, args): + nick = hostmask[0] + channel = args[0] +
message = args[-1] + + for gw in puppets: + for puppet in puppets[gw]: + if puppets[gw][puppet].nick == nick: + return # don't act on own messages + + # Get on which gateway it was sent + for gw in gateways: + if gateways[gw] == channel: + gateway = gw + break + else: + return + + # Craft API message + api_message = { + "text": message, + "username": nick, + "userid": hostmask[1] + "@" + hostmask[2], + "gateway": gateway + } + + print("* Sending messages to gateway") + + req = requests.post(f"{api_host}/api/message", json=api_message) + + +def api_loop(): + away = False + sess = requests.Session() + sess_retries = Retry(total=10000, + backoff_factor=0.1, + status_forcelist=[ 500, 502, 503, 504 ]) + + sess.mount('http://', HTTPAdapter(max_retries=sess_retries)) + sess.mount('https://', HTTPAdapter(max_retries=sess_retries)) + + while True: + req = sess.get(f"{api_host}/api/stream", stream=True) + + try: + for line in req.iter_lines(): + if line: + # Unset away status for all puppets + if away: + for gw in puppets: + for puppet in puppets[gw]: + puppets[gw][puppet].send("AWAY") + away = False + message = json.loads(line.decode('utf-8')) + + if message.get("message") == "Not Found": + print("Error: Channel not found: %s" % channel) + return -1 + + if message["event"] == "api_connected": + print("Connected!") + elif message["event"] == "" or message["event"] == "user_action": # XXX: Probably a message + channel = gateways[message["gateway"]] + print("Bridging message event") + # Connect puppet if it does not exist already + puppet = connect_new_puppet(message["username"], message["userid"], message["account"]) + + if puppet: + # Have the puppet join the channel if it isn't joined already + puppet.send("JOIN", channel) + + # Parse message for long code blocks + if paste_enabled: + code_blocks = re.findall("^\\`\\`\\`(.*?)^\\`\\`\\`", message["text"], flags=re.S + re.M) + for code_block in code_blocks: + if code_block.count("\n") > paste_maxlines: + cb_checksum = 
hashlib.sha3_224(bytes(code_block, "utf-8")) + file = cb_checksum.hexdigest() + ".txt" + message["text"] = message["text"].replace("```" + code_block + "```", "%s%s" % (paste_domain, file)) + with open(paste_dir + "/" + file, mode='w') as file_paste: + print(code_block, file=file_paste) + else: + # Remove trailing ```'s + message["text"] = message["text"].replace("```" + code_block + "```", code_block) + + # Send the message + for line in message["text"].split("\n"): + if len(line) >= irc_msglen: + # Hacky... + buf = "" + i = 0 + for char in line: + i += 1 + if len(buf) < irc_msglen and i != (len(line) - 1): + buf += char + elif len(buf) == irc_msglen or i == (len(line) - 1): + puppet.msg(channel, buf) + buf = "" + else: + if line != "": + if message["event"] == "user_action": + puppet.me(channel, line) + else: + puppet.msg(channel, line) + sleep(0.5) + elif message["event"] == "join": + channel = gateways[message["gateway"]] + print("Bridging join event") + # Connect puppet if it does not exist already + puppet = connect_new_puppet(message["username"], message["userid"], message["account"]) + + if puppet: + # Have the puppet join the channel + puppet.send("JOIN", channel) + elif message["event"] == "leave": + channel = gateways[message["gateway"]] + # Check if puppet exists + puppet = get_puppet_client(message["account"], message["userid"]) + + if puppet == None: + # We have nothing to do + continue + puppet.send("PART", channel, "Leaving") + except (requests.exceptions.ConnectionError, + requests.exceptions.ChunkedEncodingError): + print("Disconnected from Matterbridge") + + # Set away status for all puppets + if away == False: + away = True + for gw in puppets: + for puppet in puppets[gw]: + puppets[gw][puppet].send("AWAY", "Matterbridge disconnected") + continue + +def fail_on_pm(irc, hostmask, args): + nick = hostmask[0] + target = args[0] + + if "\x01" in args[1]: + return + + if target == irc.nick: + irc.msg(nick, "[Automated message] This is a bridged 
puppet, private messages are currently unsupported - message not sent.") + +def on_irc_kick(irc, hostmask, args): + print("* Got kick") + if irc.nick == hostmask[0]: + print("* Trying to rejoin channel") + irc.join(args[0]) + +def clear_ident_file(irc, hostmask, args): + if ident_enabled: + print("* Clearing ident file") + # Clear ident file + with open(ident_file, 'w') as file: + file.write("\n") + +def main(): + # Set CTCP VERSION reply + miniirc.version = "Matterpuppeter · https://git.vitali64.duckdns.org/utils/matterpuppeter.git" + + # Set handlers + miniirc.Handler('001')(clear_ident_file) # Global handler + # Write ident for the bot + write_ident_file(ident_bot) + + # Connect to IRC + irc = miniirc.IRC( + irc_host, + irc_port, + irc_nick, + realname=irc_gecos, + debug=False, + ssl=irc_tls, + ns_identity=sasl_auth if sasl_enabled else None, + quit_message="shutting down", + auto_connect=True, + persist=True + ) + irc.Handler('PRIVMSG', colon=False)(on_irc_msg) + irc.Handler('NOTICE', colon=False)(on_irc_msg) + irc.Handler('KICK', colon=False)(on_irc_kick) + + + for gateway in gateways: + irc.send("JOIN", gateways[gateway]) + + api_loop() + +if __name__ == '__main__': + with open("config.yaml") as file: + config = yaml.load(file, Loader=yaml.FullLoader) + try: + mb_account = config["api"]["account"] + irc_host = config["irc"]["host"] + irc_port = config["irc"]["port"] + irc_tls = config["irc"]["tls"] + irc_nick = config["irc"]["nick"] + irc_gecos = config["irc"]["gecos"] + irc_msglen = config["irc"]["message_limit"] + api_host = config["api"]["host"] + ident_enabled = config["ident"]["enable"] + if ident_enabled: + ident_file = config["ident"]["file"] + ident_fmt = config["ident"]["format"] + ident_bot = config["ident"]["username"] + client_limit = config["irc"]["client_limit"] + paste_enabled = config["paste"]["enable"] + if paste_enabled: + paste_dir = config["paste"]["dir"] + paste_domain = config["paste"]["domain"] + paste_maxlines =
config["paste"]["maxlines"] + sasl_enabled = config["irc"]["sasl"]["enable"] + if sasl_enabled: + sasl_auth = (config["irc"]["sasl"]["username"], + config["irc"]["sasl"]["password"]) + gateways = config["gateway"] + except KeyError: + print("Error: Configuration file is missing options") + exit(1) + main() |
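Note on the long-line handling in `api_loop()` (the block marked "Hacky..."): the intent appears to be cutting a line into pieces no longer than `message_limit` characters before sending. A minimal standalone sketch of that idea follows. The helper name `split_line` is hypothetical and not part of this commit, and it counts characters as the committed code does, not the bytes that the IRC limit is actually measured in.

```python
def split_line(line: str, limit: int) -> list[str]:
    """Cut one line into consecutive chunks of at most `limit` characters.

    Hypothetical helper, not part of the commit. It counts characters,
    like the committed loop does, so multi-byte text may still exceed
    the byte limit enforced by the IRC server.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Step through the string in strides of `limit`; the last chunk
    # simply holds whatever remains.
    return [line[i:i + limit] for i in range(0, len(line), limit)]


if __name__ == "__main__":
    for chunk in split_line("a" * 10, 4):
        print(chunk)
```

Each chunk could then be passed to `puppet.msg(channel, chunk)` in place of the manual buffer loop, assuming the same `puppet` and `channel` variables as in the diff.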