Policy on full-text search
The search capabilities of different fediverse software are varied.
GoToSocial limits text search to your own posts.
I don't know what Akkoma does because the docs aren't loading for me right now. https://docs.akkoma.dev/stable/configuration/search/
Mastodon has full text search on public posts, with some limitations. https://docs.joinmastodon.org/admin/elasticsearch/
It's also technically possible for any node to be modified to add full-text search across every post that passes through it. (I'm pretty sure our operator CoC would forbid this, but I don't know where it is atm; I checked information.websiteleague.org.)
I think we should discuss this and decide on a stance before a node with this capability comes online.
ruby Mon 30 Sep 2024 1:43AM
@sirocyl In Akkoma's case, I think there's a built-in full-text search based on Postgres' GIN indices, with config to use RUM indices (with a Postgres extension) instead, or an external Meilisearch instance. Unsure what GTS supports (I can look into this), but it seems like if we wanted full-text search, we wouldn't need to rely on ElasticSearch specifically.
wenchcoat system Sun 29 Sep 2024 11:40PM
not being able to search my own chosts always felt terrible on cohost. I'd rather not reproduce that shortcoming. as a superset of that, I think I'm okay with authenticated (only!) node-local search, whether or not it's restricted to moderators.
sirocyl Mon 30 Sep 2024 12:43AM
@wenchcoat system largely agree here, I would like for the possibility to search locally or own-posts, as long as it doesn't require dragging Elasticsearch into the game.
vis Mon 30 Sep 2024 1:51AM
In my opinion, full-text search should not be enabled league wide (fwiw i don't think it can be enabled at all) but should be left up to node staff to determine whether it is a good or bad thing for their instance. as soon as it becomes an avenue for harassment, it should probably be turned off (but i think if someone's searching for topics and getting into fights that might be more an issue with that user)
Shel Mon 30 Sep 2024 4:57PM
I think full-text search can be incredibly useful for understanding conflicts and finding the source of information, or just looking for a good old post that you saw and loved. Not having it creates as many problems as having it does, but in the latter case we can moderate the undesirable behaviors associated with full-text search. The primary reason we avoided having it on Mastodon for so long was KiwiFarms, which is no longer much of an issue, and they would find it very difficult to get into the Weague. That said, full-text search is very resource-heavy on any social media website, especially decentralized ones. It can make running a node much more expensive (and the more expensive it is to run a node, the higher the barrier to people other than rich techies running one).
art semaphore Mon 30 Sep 2024 8:33PM
given it's always possible to download all your own posts and search them locally, i think having full text search for your own stuff on the server would be nice to have.
my only interaction with sitewide searching was on twitter, where ipv6 proponents text-searched my posts complaining about setting up ipv6 to send me ipv6 propaganda, which was utterly bizarre.
sirocyl Mon 30 Sep 2024 10:02PM
@art semaphore I've seen those IPv6 trolls too. What a weird quadrant. I'm generally in support of IPv6 myself; I'm not going out and namesearching to reply to the entire firehose about the topic, though, and I'm certainly not downplaying the issues it has in those replies.
sirocyl · Sun 29 Sep 2024 1:21PM
I'll put in a "no" on full-text search across instances in the League, but on a technical (dis-)merit rather than social; full-text search, at least in Mastodon, perhaps others, does depend on ElasticSearch, which is a hulking goddamn behemoth of software. Treehouse has very recently had to address an outage caused by it. It would balloon our server requirements, and we might have to up-provision and scale up our servers continuously as it churns through RAM, bandwidth and storage.